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About This Manual 



This manual describes the functions of the IBox in the VAX 9000 family system. It is 
a reference manual for Customer Services personnel as well as a training resource for 
Educational Services. 

Intended Audience 

The content, scope, and level of detail in this manual assumes that the reader: 

• Is familiar with the VAX architecture and VMS operating system at the user level 

• Has experience maintaining midrange and large VAX systems 

Manual Structure 

This manual has six chapters, an appendix, a glossary of IBox terms, and an index. 
Chapter 1 contains an introduction to the IBox, describing the major features, functions, 
and physical organization of the unit. Chapter 2 describes the program counter Chapters 
3 4 and 5 provide a detailed description of the functions of the IBox, with each chapter 
emphasizing one of the three main pipeline stages of the unit. Chapter 6 is a summary 
of the errors that can be generated in the IBox. 

Appendix A provides a listing of the input and output signals of the three multichip units 
(MCUs) that comprise the IBox. 
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XIII 



1 

General Description 



This chapter provides an overview of the VAX 9000 family IBox, including descriptions of 
the major hardware features, physical characteristics, and interbox interfaces. 

1.1 IBox Introduction 

The IBox is an independent functional unit that fetches and decodes instructions and 
their specifiers from the MBox and passes them to the EBox for execution. The EBox is 
provided with data and control functions so that it can execute several instructions in one 
cycle. 

The IBox provides all required instruction data to the EBox: source, destination, PC, 
fork address, as well as pointers to the source data and destination. The instruction 
data is stored in the EBox source list or general-purpose registers (GPRs). In the case 
of a memory source operand, the IBox generates the operand address and passes it to 
the MBox requesting a memory read operation. The MBox prefetches the data and then 
writes it into the EBox source queue. 

Using the source pointers provided by the IBox, the EBox accesses its source list or GPRs 
and executes the instruction. Depending on the destination, the EBox writes the results 
to a GPR or transfers the results to the MBox to be subsequently written to memory. In 
effect, the EBox deals only with data (no opcodes or operand specifiers). 

In addition, the IBox can decode and store several instructions ahead of the EBox. 
However, given the capabilities of the EBox and MBox, it is difficult for the IBox to 
remain several instructions ahead. 

1 .2 Basic Hardware Implementation 

This section describes the major new hardware implementations incorporated into the 
IBox. Figure 1-1 shows the basic IBox block diagram. 

1.2.1 Program Counter Unit 

The program counter unit (PCU) is responsible for directing the I-stream that the IBox 
will decode, and for generating the PC address used by the EBox and MBox. The PCU 
provides the control logic for the two IBox caches: virtual instruction cache (VIC) and 
the branch prediction cache (BPC). In addition, the PCU controls the secondary branch 
prediction mechanism. 
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Figure 1-1 Basic IBox Block Diagram 
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1.2.2 Virtual Instruction Cache 

The virtual instruction cache (VIC) is a virtually addressed, 8 Kbyte, direct-mapped, 
one-way associative cache that reduces the number of I-stream requests issued to the 
MBox. By flushing the VIC on every REI, the IBox can ignore writes to memory. Having 
a virtually addressed instruction cache eliminates the need for address translation and 
translation logic. Because the VIC has its own MBox request port, it is refilled from the 
MBox data cache, rather than memory. 

1.2.3 Instruction Buffer 

The IBox incorporates a 25-byte instruction buffer that latches the I-stream. The 

instruction buffer is partitioned into a 9-byte instruction buffer (IBUF), an 8-byte 

extended instruction buffer (IBEX), and an additional 8-byte extended instruction buffer 

(IBEX2). 

The content of the nine IBUF bytes are latched, decoded, and shifted. The remaining 16 

bytes of IBEX and IBEX2 contain additional prefetched I-stream from the VIC, which is 

passed to the IBUF as required. 

1.2.4 Branch Prediction 

When branch instructions are encountered in the I-stream, the IBox predicts the direction 
the I-stream will follow (redirects the I-stream by taking a branch or continues decoding 
sequential I-stream by not taking the branch). When initially encountering a branch, 
the IBox decides to take or not take the branch by accessing the logic that contains fixed 
predictions for each branch instruction. Branch predictions are validated by the EBox 
and the correct predictions are stored in a branch prediction cache that is accessed when 
the same branch is subsequently encountered. 

1.2.4.1 Branch Prediction Cache 

To minimize the idle time spent flushing and refilling the pipeline after every branch 
instruction, the instruction decode stage includes a branch prediction cache (BPC). This 
IK virtual cache increases performance by storing information about the branch validity, 
and the target address. As a branch instruction is being decoded, it is referenced, in 
parallel, in the BPC and a prediction is made whether to take the instruction. The IBox 
uses the cached target address to redirect the instruction fetch stage to the new I-stream 
if the branch is taken. For performance reasons, the BPC is never flushed. 

1.2.5 Multiple Specifier Decode Unit 

The multiple specifier decode unit (XBAR, or "crossbar") is implemented as a set of 
multiplexers that provide the capability of simultaneously decoding up to three operand 
specifiers. The I-stream is presented to the XBAR nine bytes at a time from the IBUF. 

The actual number of specifiers decoded depends on the specifier type. The XBAR can 
decode up to three specifiers (for example, two simple specifiers and one complex specifier, 
or three simple specifiers). Simple specifiers are considered register mode or short literal, 
while all other specifiers are considered complex. 
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1.2.6 Specifier Handlers 

The specifier handlers are in the operand processing unit (OPU). The logic is comprised 
of three dedicated specifier evaluation units: 

Complex specifier unit 
Short literal unit 
Free pointer logic 

In addition, the OPU maintains a set of GPRs. The GPRs are implemented in self-timed 
register (STREG) files that provide multiported, read/write access capabilities. 

The OPU maintains a virtually addressed MBox request port. 

1.2.6.1 Complex Specifier Unit 

The complex specifier unit (CSU) is responsible for evaluating complex specifiers. 
The XBAR supplies the CSU with the register, register index, and up to 32 bits of 
displacement. The CSU calculates branch target addresses, immediate operands, and 
memory addresses of operands supplied to the EBox from the MBox. 

The CSU contains the OPU port interface to the MBox and the interface to the EBox. 
Operand addresses are sent to the MBox, while the EBox receives the immediate 
operands directly from the CSU. 

The CSU also contains the IBox GPRs. The GPRs are read and written by the CSU, and 
written by the EBox. 

1.2.6.2 Short Literal Unit 

The short literal unit (SLU) receives decoded short literal specifiers from the XBAR and 
expands them for entry into the EBox source list. The SLU can produce a single longword 
of expansion each cycle. The literal expansion depends on the specifier data type. 

1.2.6.3 Free Pointer Logic 

The free pointer logic (FPL) manages pointers into the EBox source list. The FPL tracks 
the available (free) source list addresses and associated pointers. The FPL establishes 
the correct source 1 and source 2 pointers for operands the EBox will use to execute an 
instruction. The FPL also generates the correct destination pointer for an instruction 
result. 
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1.2.7 Read/Write Scoreboards 

The IBox processes specifiers while the EBox is executing previously decoded instructions. 
The IBox must be prevented from performing an address calculation that depends on the 
result of a currently executing instruction. This is accomplished by recording the register 
numbers to be written by the EBox, and by matching those numbers against any register 
selected for use in an operand specifier. When a match occurs, the CSU stalls and waits 
for the instruction to be completed before calculating the operand address. 

The XBAR generates the read and write masks for conflict checking in the OPU. The 
masks represent GPR through 14. The masks prevent reading or writing a GPR that is 
scheduled to be modified or read by the EBox. 

The CSU maintains the read and write register scoreboards for up to six instructions in 
the pipeline. Both scoreboards contain 15-bit registers, representing GPR through 14. 

The read scoreboard tracks the GPRs designated to be read by a previously decoded 
instruction. These GPRs cannot be written by the OPU for autoincrement and 
autodecrement until their corresponding instructions are completed by the EBox. 

The write scoreboard tracks GPRs designated as destinations by instructions in the 
pipeline. These GPRs cannot be used by the OPU for address calculations until the 
instruction is completed by the EBox. 

1 .3 Physical Organization 

The IBox logic is physically contained in three multichip units (MCUs) (Figure 1-2). This 
section introduces each MCU and its related macrocell arrays (MCAs). 
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Figure 1-2 Planar Module Layout 
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1.3.1 VICMCU 

The VIC MCU is comprised of two MCAs and forty-two IK x 4-bit STRAMs. Two bit- 
slices of the program counter and the data STRAMs for branch prediction and VIC are 
resident in this MCU. Figure 1-3 shows the MCU content. 

The following list introduces the MCA and STRAM functions of the VIC MCU: 

• PCBP MCA — The program counter/branch prediction control MCA contains bits 
[07:00] of the PC and provides branch prediction control. 

• PCVC MCA — The program counter/VIC control MCA contains bits [15:05] of the PC 
and provides VIC control. 

• VIC data STRAMs — These eighteen IK x 4-bit STRAMs are dedicated to the VIC 
data and associated byte parity. 

• Branch prediction STRAMs — These twenty-four IK x 4-bit STRAMs are 
dedicated to the branch prediction function. The STRAMs store the branch PC 
tag, prediction PC, branch instruction length, and prediction bit. 
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Figure 1-3 VIC MCU Content 
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1.3.2 XBRMCU 

The XBR MCU contains the instruction buffer, XBAR, two of the four PC slices, and tag 
STRAMs for the VIC. Figure 1-4 shows the MCU content. 

The following list introduces the MCA and STRAM functions of the XBR MCU: 

• PCLO MCA — The program counter low MCA contains bits [23:13] of the PC. 

• PCHI MCA — The program counter high MCA contains bits [31:24] of the PC. 

• IBFA MCA — The instruction buffer A MCA contains the low-order nibble of the 
instruction buffer. IBFA also provides the VIC hit logic. 

• IBFB MCA — The instruction buffer B MCA contains the high-order nibble of the 
instruction buffer. Parity checking is performed by combining IBFB parity with a 
partial parity from IBFA. 

• XDTA MCA — The XBAR data A MCA contains the low-order nibble of the XBAR. 
The major MCA outputs are displacements for the OPU. 

• XDTB MCA — The XBAR data B MCA contains the high-order nibble of the XBAR 
data path. 

• XSCA MCA — The XBAR control MCA is the XBAR control unit. It receives I- 
stream data from the IBUF and performs some simple instruction decoding. The 
instruction buffer shift control is generated from the number of specifiers decoded 
and the number of specifiers the instruction contains. 

• VICT STRAMs — Five of the nine IK x 4-bit STRAMs contain the 19-bit VIC tags. 
Two of the STRAMs provide the 4-bit VIC quadword valid fields and associated parity 
bits and parity for the VIC quadword valid bits. The remaining two provide the VIC 
block valid field. 
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Figure 1-4 XBR MCU Content 
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1.3.3 OPUMCU 

The OPU MCU contains the logic responsible for the specifier decode process. The 
operand port interface to the MBox also resides in this MCU. The MCU also contains a 
pair of self-timed registers (STREGs) that provide the IBox GPRs. Figure 1-5 shows the 
MCU content. 

The following list introduces the MCA and STRAM functions of the OPU MCU: 

• OPUx MCA — The OPUA and OPUB MCAs provide the data path for the complex 
specifier unit (CSU). (OPUA provides the low-order word; OPUB provides the high- 
order word.) The CSU receives up to 32 bits of displacement from the XBAR, operand 
data from the MBox, and result data from the EBox and directs the outputs to the 
MBox, EBox, or a loopback to the CSU. 

• OSQA MCA — This MCA provides control for the GPR STREGs. The OPUA and 
OPUB multiplexers (AMUX, BMUX) are also provided by this MCA. This MCA also 
controls the EBox and OPU port interfaces and stall logic. 

• OSQB MCA — This MCA receives short literal, source, and destination data from 
the XBAR. Short literal specifiers are expanded into the correct context and passed to 
the EBox. The source and destination pointers are also passed to the EBox. 

• OCTL MCA — The operand control MCA maintains the read/write scoreboard and 
directs flush signals to other appropriate IBox functional units. 
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Figure 1-5 OPU MCU Content 
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1.4 Pipeline Overview 

The IBox consists of three main pipeline stages that closely relate to the MCU physical 
structure. Each stage can generate indicators to aid in isolating errors to an MCU FRU. 
The three pipeline stages are denned as follows: 

• Instruction fetch — This stage consists of the logic involved in fetching the I-stream 
before it is latched in the instruction buffer and includes these components: 

VIC 

Program counter logic (PCU) 
IBEX, IBEX2, and part of the IBUF 
IBUF-to-MBox interface 

• Instruction decode and branch prediction — This stage consists of the logic 
involved in decoding the instruction in the IBUF and includes these components: 

XBAR 

Branch prediction unit (BPU) 

• Specifier evaluation — This stage consists of the OPU logic involved in the 
evaluation of operand specifiers and includes these components: 

Complex specifier unit (CSU) 
Branch prediction logic 
Free pointer logic (FPL) 
Short literal unit (SLU) 
Operand control unit (OCTL) 
OPU-to-MBox interface 
OPU-to-EBox interface 

Because the IBox is pipelined, when a stage cannot complete its operation, previous 
stages must be stalled. That is, previous stage operations may be suspended. 

Figure 1-6 through 1-9 provide examples of basic pipeline flow. 
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Figure 1-6 shows the IBox pipeline decoding sequential instructions where no stalls 
are encountered in any of the pipeline stages. The following events occur in each of the 
machine cycles of this example: 

• During the first machine cycle (TO), the IBUF is loaded with the I-stream to be 
presented to the decode units of the IBox. 

• During the second cycle (Tl), the logic representing the decode pipeline stage decodes 
the opcode and the specifier bytes in the IBUF, and passes them to the specifier 
handlers. The logic representing the fetch stage replenishes the decoded bytes of the 
IBUF. 

• During the third cycle (T2) of this example, the decoded specifiers are processed by 
the specifier handlers and passed to the EBox. The decode logic and the fetch logic 
continue to perform their operations simultaneously. 

The parallel operations of the three pipeline stages continue until one of the units stalls 
or until the IBox is flushed to a new I-stream. A stall in one of the units is caused by 
instruction specifiers that require more than a single cycle to be processed or by conflicts 
in the instructions (for example, an instruction that contains two sequential complex 
specifiers and the first one requires more than a single cycle to be processed in the 
specifier pipeline stage). 
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Figure 1-6 IBox Pipeline: No Stalls 
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Figure 1-7 shows the IBox pipeline when the processing of a specifier requires more than 
a single cycle and a subsequent specifier requires processing by the same specifier 
handler. This stall occurs when the CSU is processing an autodecrement-deferred 
indexing mode specifier and another complex specifier is decoded by the XBAR. The 
following events occur in this example of the IBox pipeline: 

• During the first machine cycle (TO), the IBUF is loaded with the I-stream to be 
presented to the decode units of the IBox. 

• During the second machine cycle (Tl), the decode unit decodes an autodecrement- 
deferred indexing mode specifier and the fetch unit replenishes the decode units with 
I-stream. 

• During the third machine cycle (T2), the CSU begins processing the autodecrement- 
deferred indexing mode specifier, the decode unit decodes a complex specifier and 

is replenished by the buffers representing the fetch stage of the IBox pipeline. 
Because the autodecrement-deferred indexing mode specifier requires multiple cycles 
to process and a subsequent complex specifier is to be processed by the CSU, the 
IBox pipeline stalls and cannot resume operations until the first complex specifier is 
processed. The autodecrement-deferred indexing mode specifier requires a minimum 
of four cycles to be processed. 
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Figure 1-7 IBox Pipeline: CSU Stall 
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Figure 1-8 shows the pipeline when a branch is taken, but is predicted by the secondary 
prediction mechanism. The following events occur when a branch is predicted by the 
secondary prediction mechanism: 

• During the second cycle (Tl) of this example, a branch is decoded by the XBAR 
The BPC is accessed, but no information regarding this branch is valid. The branch 
displacement is passed to the CSU to calculate the branch target address. 

• During the third cycle (T2), the secondary branch prediction logic is accessed and the 
branch is predicted taken. The branch target PC calculation is completed by the CSU 
and the PC is passed to the fetch logic the next cycle. 

• During the fourth cycle (T3), the fetch logic requests the data at the address supplied 
by the branch target PC to be loaded into the instruction buffer. 
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Figure 1-8 IBox Pipeline: Branch Taken, Secondary Prediction 



Figure 1-9 shows the IBox pipeline during the execution of a branch instruction. The 
branch, in this case, is stored in the BPC and the history bit indicates that the branch is 
to be taken. The BPC lookup requires one extra cycle to predict the branch and load the 
correct target PC. 
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Figure 1-9 IBox Pipeline: BP Cache Hit, Prediction Taken 
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1.5 IBox Interfaces 

The IBox interfaces to each of the other CPU functional units through dedicated 
interfaces (or ports). This section describes each interface. 

1.5.1 MBox Interface 

The IBox interfaces to the MBox through two ports in the MBox: 

• Instruction buffer port — This port requests data from the MBox when a miss is 
encountered in the VIC. 

• Operand processing unit (OPU) port — This port requests memory-related 
source operands from the MBox and passes operand addresses that specify memory 
destinations to the MBox. 

1.5.1.1 Instruction Buffer Port 

The instruction buffer port is a read-only port to the MBox. The IBox uses the instruction 
buffer port to issue requests to the MBox for I-stream data — 64 bits at a time with byte 
parity. The I-stream is retrieved from the MBox cache. In the case of a cache miss, 
the request is forwarded to memory through the system control unit (SCU). Typically, 
a request is for four (aligned) quadwords to fill the VIC. Requests are initiated by the 
instruction buffer on a VIC miss. 

1.5.1.2 OPU Port 

The OPU port is a read/write operand access port to the MBox. The port has a 32-bit 
wide data path with byte parity. Any rotation of the data (justification) is performed by 
the MBox. The port provides the following functions: 

Operand prefetch from cache or memory on behalf of the EBox and VBox 

Queuing of addresses for operands destined to cache and memory 

Prefetching operands that are address deferred (indirect) from cache or memory 

The IBox issues requests on behalf of the EBox and VBox for operands that come from 
memory. The operands are passed directly from the MBox to the EBox source list. 

The IBox sends the destination address to the MBox. In turn, the MBox performs a 
translation buffer (TB) lookup and stores the physical address in the write queue. The 
MBox then waits for the result data from the EBox. 

For deferred addressing, the MBox returns the address of the operand to the IBox for a 
successive fetch for the data (operand). The data operand is returned to the EBox. 
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1.5.2 EBox Interface 

The IBox interfaces to the EBox through the queue functional unit port in the EBox. This 
unit contains a set of FIFO buffers (queues) that accept instruction control information 
and operands from the IBox. 

1.5.2.1 I Box-to-E Box Interface 

This interface is used to send operands to the EBox 32 bits at a time with byte parity and 
their respective pointers. These operands are handled by the OPU within the IBox (sign 
or zero-extended data, integer and floating short literal operands, immediate mode data). 
The source and destination pointers are maintained in queues within the EBox, and they 
allow the EBox to access the correct data. 

In addition, control information is passed to the EBox for microcode control, RLOG 
information, program count, and errors. 

1.5.2.2 EBox-to-lBox Interface 

This interface is used to transfer: 

Result data 

A starting or flushing PC 

RLOG unwind data 

Control information (branch valid, queue full, keep masks) 

This interface transfers 32-bit EBox result data (including byte parity) to the IBox. The 
result data is generally an operand that has a destination of a GPR. The data, including 
the byte parity, is written to the IBox GPR set. 

In addition to the various control signals, the EBox result data may also be of a 
controlling function, for example, an address passed to the IBox to initiate instruction 
fetch. 

1.5.3 VBox Interface 

The VBox issues requests for operands through the IBox. It sends the address (30 
bits) with byte parity and control data. Control data is the reference data size, type of 
reference (read or write), and whether it is a block read. 

1.5.4 Service Processor Interface 

The service processor unit (SPU) interfaces through the scan latches throughout the IBox 
data and control paths, and error logic. The SPU can retrieve and store the state of the 
IBox scan logic for error reporting and possible recovery. Field Service can then perform 
analysis on the stored error symptom data for identification of the failing FRU. 

The scan paths can implement symptom-directed diagnosis (SDD) as well as test-directed 
diagnosis (TDD). The scan latches have scan data/clock inputs and the usual system 
data/clock inputs. The scan inputs can be controlled by the SPU and diagnostics to shift 
test patterns in and test for the correct pattern that is subsequently retrieved. 
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This chapter describes the functions of the IBox program counter unit (PCU). It identifies 
the major PCs that control the IBox and the MCAs that comprise them. This chapter 
also provides a detailed explanation of the PC parity coverage and the construction of the 
32-bit PCs across four MCAs. 

2.1 Overview 

The PCU directs the flow of the I-stream through the IBox and provides control to the 
branch prediction cache (BPC) and the virtual instruction cache (VIC). The PCU also 
directs the EBox and MBox in instruction fetch and instruction execution by supplying 
a copy of the PC to the EBox and by providing an address to the MBox on a VIC refill 
operation. 

2.2 Major PCs 

The IBox generates four major PCs: 

Prefetch PC 
Decode PC 
Branch PC 
Unwind PC 

The primary inputs to the PCU are from the EBox, the OPU MCU, and the branch 
prediction cache (BPC). The EBox input (EBOX_RESULT_H[31:00]) directs the IBox to 
begin decoding a new I-stream as the result of a flush. The OPU input (OPUJRESULT. 
H[31:00]) provides the branch target address when a branch in the I-stream is not stored 
in the BPC. The target address is calculated by the complex specifier unit (CSU) and 
loaded into the decode PC (DECODE_PC_H[31:00]). When a branch stored in the BPC is 
decoded, BP_PREDICTION_PC_H[31:00] is loaded into the prefetch PC. 

Figure 2-1 is a block diagram of the PCU data path. 

2.2.1 Prefetch PC 

The prefetch PC is 32 bits wide and is used during the instruction fetch stage of the 
pipeline. This PC is the address of the next quadword that the instruction buffer receives. 
This address is used to request data from the VIC. When a cache miss is encountered, 
this address is the address of a Tequest to the MBox. 

Each cycle, the prefetch PC is incremented by a quadword until the buffers of the 
instruction buffer are completely full. While the instruction buffer is full, the prefetch PC 
is held. The PC begins incrementing again when IBEX2 is empty. 
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Figure 2-1 PCU Data Path 
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When there is a request for I-stream, the prefetch PC is used as the address to the MBox. 
All requests for I-stream are quadword aligned, so it is not necessary to send bits 
through 2 of the PC to address the MBox data cache. 

On a flush, the EBox provides the new PC to the IBox. The IBox then uses the PC 
provided to address the VIC and to address the MBox on a VIC miss. 

2.2.2 Prefetch PC Data Path 

The prefetch PC is loaded from the target PC, BP prediction PC, incremented PC, or 
MTAG. When a target PC is selected as the prefetch PC, it redirects the IBox from 
decoding sequential I-stream. The target PC is loaded from the: 

EBox during a flush 

OPU as a calculated branch PC 

Unwind PC when a branch prediction is incorrect 

BP_PREDICTION_PC_H[31:00] is selected as the prefetch PC when the branch 
prediction cache is supplying the PC of the branch being decoded. 

MTAG is loaded into the prefetch PC on each cycle of an MBox response. The MTAG is 
the memory tag of the returning data from the MBox. MTAG is incremented by eight 
each cycle of the MBox response and, when the response is complete, is loaded into the 
prefetch PC. The prefetch PC gets loaded with the MTAG because, during the MBox 
response, the prefetch PC can increment beyond the next quadword it is to receive. 

The prefetch PC outputs control the VIC and provide the addresses of requests to the 

MBox. 

Figure 2-2 describes the data path of the prefetch PC. The prefetch PC is selected from: 

• MTAG — At the completion of a VIC refill, MTAG is loaded into the prefetch PC to 
ensure that the prefetch PC points to the next quadword loaded into the instruction 
buffer. 

• Target PC — A target PC from one of the sources described in Section 2.2.3 can be 
loaded into the prefetch PC to redirect the I-stream. 

• Incremented PC — This input increments the prefetch PC by a quadword. 
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Figure 2-2 Prefetch PC Data Path 
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2.2.2.1 Prefetch PC Control 

The prefetch PC is incremented each cycle until the instruction buffer informs the PCU 
that IBEX and IBEX2 are full. When these units are full, the prefetch PC is held until 
IBEX2 again requires an I-stream. This continues until a request is made for an I-stream 
that is not in the VIC or a new target PC redirects the prefetch PC from sequential 
decoding. 

2.2.3 Target PC 

The target PC is loaded into both the prefetch PC and the decode PC to redirect the 
I-stream from decoding sequential instructions. A branch in the I-stream or a flush loads 
a new target PC into the decode PC and prefetch PC. 

Figure 2—3 describes the sources of the target PC. 

TARGET_PC_SELECT_H[01:00] is controlled by the PCU microcode and selects an input 
to redirect the I-stream for each of the following conditions: 

• Prediction PC is selected when a branch that is being decoded is stored in the BPC 
(BP hit). 

• OPU target PC provides the target PC when a branch is predicted taken but not 
stored in the BPC (BP miss). 

• EBox result provides the target PC when the EBox directs the IBox to flush to a new 
I-stream. 

• Unwind PC is loaded into the target PC after a bad branch prediction. 
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Figure 2-3 Target PC Sources 
2.2.4 Decode PC 

The decode PC is the PC of the instruction whose opcode is in the opcode byte (byte 0) 
of the instruction buffer. That is, it is the PC of the instruction currently being decoded. 
The decode PC does not reflect intermediate shift counts that result from partially 
decoded instructions. The decode PC is used in the instruction specifier stage of the IBox 
and is passed to the EBox to be placed in the PC history buffer. In the EBox it is called 
the VAX PC because it is the PC of the instruction to be executed and is used by the 
EBox when handling exceptions. 

This PC is also used by the CSU to evaluate implied specifiers and branch displacements. 
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The decode PC is loaded from two sources. The next or sequential PC provides the 
decode PC when instruction decode is following a sequential path (straight line) and 
when branches are encountered that are predicted not taken. When a branch is predicted 
taken or a flush is encountered, a new (nonsequential) PC is loaded into the decode PC. 
This PC is supplied by the OPU or the branch prediction cache on branch predictions or 
is supplied by the EBox on a flush. 

2.2.5 Decode PC Data Path 

The decode PC is loaded from the next PC, when decoding sequential I-stream, or is 
loaded from the target PC. The target PC is loaded into the decode PC at the same time 
a new target PC is loaded into the prefetch PC. The next PC is generated each cycle by 
adding the number of bytes the XBAR decodes to the decode PC. When the instruction 
under decode is completely decoded, this addition yields a new decode PC. 

The decode PC is output to the EBox (IBOX_PC_H[31:00]) to be placed in a queue until 
the instruction is executed. The CSU of the OPU receives a copy of the decode PC (PC_ 
OPU_DECODE_PC_H[31:00]) to identify the instruction that is being decoded. When a 
branch instruction is being processed, and the branch meets the criteria required to be 
written in the branch prediction cache, the decode PC is written to that cache. The BP 
tag is written with DECODE_PC_H[31:10] at the address supplied by DECODE_PC_ 
H[09:00]. 

Each cycle, the decode PC is added to the number of bytes decoded by the XBAR (XBAR_ 
SHIFTCOUNT_H[03:00]). When all of the specifier bytes of an instruction are decoded 
(XSCA_SHIFTOPCODE_H asserted), the next sequential PC or next PC is loaded as the 
decode PC. 

Each new decode PC that is loaded asserts IBOX_PC_H[31:00] in the EBox and OPU_ 
DECODE_PC_H[31:00] in the OPU MCU. The EBox copy of the decode PC is stored in a 
queue for instructions to be executed. The OPU copy of the decode PC is used to calculate 
specifier PCs and branch displacement. 



TARGET PC 



XBAR SHIFT COUNT 



LATCH 



DECODE PC 



XSCA_SHIFTOPCODE_H 



MR_XD053_8B 



Figure 2-4 Decode PC Data Path 

2.2.5.1 Delta PC 

The XBAR shift count (XSCA_SHIFTCOUNT_H[03:00]) is accumulated to produce the 
decode delta. That is, the total number of bytes in an instruction yields the decode delta. 
The decode delta provides the OPU and the BPC with the instruction length. 
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2.2.6 Branch PC 

The branch PC is the address of the branch instruction that is currently under 
evaluation. When the IBox encounters a branch, the decode PC is loaded into the branch 
PC. The branch PC is used to address the branch prediction cache. The branch PC is 
compared with the branch prediction (BP) tag to produce a hit or miss in the branch 
prediction cache. 

2.2.7 Branch PC Data Path 

The branch PC is loaded from either the decode PC or the second branch PC. When a 
branch instruction is being processed, the address of the branch is saved until the branch 
is shifted out of the instruction buffer. After the branch is shifted out of the instruction 
buffer, the branch PC provides the information pertaining to the branch that is written to 
the branch prediction cache. 

Because the IBox can process two branches simultaneously, a second branch PC must 
be generated for the second branch. The second branch PC (SECOND_BRANCH_PC_ 
H[31:00]) is stored in a scan latch until the first branch is completely processed. The 
second branch PC is then loaded into the branch PC and used as data to be written to 
the branch prediction cache. 

Figure 2-5 describes the data path of the branch PC. The decode PC is loaded into the 
branch PC for each branch the IBox encounters. The address is saved until the branch is 
completely processed. If two branches are being processed, each address is saved. 



DECODE PC 



DECODE PC SECOND BRANCH PC 

LATCH 



BRANCH PC 



MR_XCD5*_B9 



Figure 2-5 Branch PC Data Path 
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2.2.8 Unwind PC 

The unwind PC is the PC of the direction not taken in a branch prediction. When the 
IBox makes a branch prediction, two PCs relate to the prediction: the branch target PC 
and the next or sequential PC. When a branch is predicted taken, the branch target PC 
or the prediction PC is the path that the instruction execution follows. The next PC is the 
path that is not taken. The IBox saves the path that is not taken (in this case, the next 
sequential PC) in case the IBox predicted the branch incorrectly. If the EBox informs the 
IBox that a branch has been incorrectly predicted, the IBox can correct the prediction by 
loading the unwind PC into the decode and prefetch PCs and start decoding in the correct 
path. 

Because the IBox can process two branches simultaneously, it is necessary to have two 
unwind PCs. These PCs are called the unwind PC and the second unwind PC. These 
PCs are both used when a branch is predicted and a second branch is predicted before 
the first has been validated by the EBox. Once the first branch is validated, the second 
unwind PC is loaded into the first unwind PC. The original, first unwind PC is discarded. 

2.2.9 Unwind PC Data Path 

When the IBox predicts a branch, the PC of the instruction that will not be executed is 
saved as the unwind PC. In the event of a wrong prediction, the PCU can return to the 
PC that was saved and continue with the correct I-stream. 

The PCU can maintain information pertaining to two conditional branches, therefore, 
two unwind PCs are stored. Figure 2-6 describes the data path of the unwind PC. The 
unwind PC is loaded from one of three sources: 

• Next PC — This PC is loaded when a branch is predicted taken. 

• OPU result — This PC is loaded when a branch is predicted not taken. 

• Second unwind PC — This PC is loaded by latching the next PC and loading it as 
the unwind PC. 

The second unwind PC is loaded into the unwind PC only after the first branch that has 
been predicted has been validated by the EBox. 



NEXT PC 



OPU RESULT 



SECOND UNWIND PC 



UNWIND PC 



MR_X0055_89 



Figure 2-6 Unwind PC Data Path 
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2.3 Cache Control 

Control of the branch prediction cache (BPC) and the virtual instruction cache (VIC) 
is provided by the PCU. The VIC is addressed by the prefetch PC on read and write 
operations, the VIC tag is compared to the prefetch PC to determine a match when 
requesting data, and the prefetch PC provides the address for MBox requests on a VIC 
refill. The decode PC provides both the BP tag and the BP address. Figure 2—7 shows 
the relationship of the cache control signals to the PCU outputs. 
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Figure 2-7 PCU Cache Control 

2.3.1 BPC Control 

The branch PC provides BPC fields. The BP tag is bits 31 through 10 of the branch PC 
and is written when the branch is first encountered. The tag is written at the cache 
address that is supplied by decode PC bits through 9. When a branch is encountered, 
the virtual address of the branch (DECODE_PC_H[31:00]) is compared to the BP tag to 
determine a hit or miss in the BPC. 

2.3.2 VIC Control 

The prefetch PC provides control for the VIC. This PC provides the address of a request 
to the MBox (IBOX_IB_ADDRESS_H[31:03]), the VIC tag (VIC_TAG_H[31:13]), and to 
address the VIC when replenishing the instruction buffer. 

Bits 31 through 13 provide the VIC tag. The tag is written on a VIC refill and 
subsequently compared with bits 31 through 13 of the prefetch PC when the instruction 
buffer is requesting VIC data. 

Bits through 9 of the prefetch PC provide the cache address on VIC requests and, when 
requesting data from the MBox data cache bits 31 through 3 of the prefetch PC, provides 
the requested address (IBOX_IB_ADDRESS_H[31:031). 
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2.4 PCU Error Detection 

Each of the MCAs that comprise the PCU receives inputs from most of the other 
functional units in the IBox and from the MBox and EBox. Each PCU MCA contains 
parity detection logic that detects errors in the internally generated signals and inputs to 
the PCU. Errors detected in the PCU MCAs result in both fetch and decode errors being 
asserted in the IBox and EBox. 

This section contains block diagrams of the error logic of each PCU MCA and lists the 
decode and fetch errors that can occur in each MCA 

2.4.1 PCVC Error Logic 

PCVC fetch errors are latched in IBFB as an IBFB_FETCH_ERROR and then asserted 
in the EBox as IBOX_FORK_ERROR_H. During this process, the scan latches in PCVC 
receive a hold signal. The hold signal preserves the state of the error scan latches so that 
the intermediate error can be determined. 

PCVC decode errors are latched in XDTB and OSQB as XDTB_DECODE_ERROR_H. 
OSQB asserts IBOX_POINTER_ERROR_H and DATA_ERROR_H in the EBox. 

Figure 2-8 shows a block diagram of the PCVC error logic and Table 2-1 lists the error 
signals generated in this MCA. 



Table 2-1 PCVC Errors 



Input Signal 
Fetch Errors 



DECODE_PC[ 15:08] 
PREFETCH_PC_H[ 15:08] 
PCBP_DECODE_PC[05:00] 
BP_PREDICTION 
BP_PREDICTION_TAG_H[ 1 5:00] 



Decode Errors 



Intermediate Error 



PCVCJDECODE_PC_ERROR 

PCVC_PREFETCH_PC 

PCVC_PCBP_DECODE_PC_ERROR 

PCVC_BP_PREDICTION_ERROR 

PCVC_BP_PRED_TAG_ERROR 



XSCA_XSCA_SHIFTCOUNT_H[03:00] 
XSCA_SHIFTOPCODE 



PCVC_DECODE_ERROR 
PCVC DECODE_ERROR 
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Figure 2-8 PCVC Error Logic 
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2.4.2 PCBP Error Logic 

Figure 2-9 shows a block diagram of the PCBP error logic and Table 2-2 lists the error 
signals generated in this MCA. 

Table 2-2 PCBP Errors 



Input Signal 



Fetch Errors 



Decode Errors 



XSCA_SHIFTCOUNT_H[03:00] 
XSCA_SHIFTOPCODE_H 



Intermediate Error 



DECODE_PC[07:00] 

PREFETCH_PC_H[07:00] 

INSTRUCTION_LENGTH[05:00] 

DECODE_PC[23:16] 

DECODE_PC[31:24] 

BP.PREDICTION 



PCBP_DECODE_PC_ERROR 

PCBP_PKEFETCH_PC_ERROR 

INSTRUCTION_LENGTH_ERROR 

PCBP_PCLO_DECODE_PC_ERROR 

PCBP_PCHI_DECODE_PC_ERROR 

BP PREDICTION.ERROR 



XSCA 
XSCA 



2.4.3 PCLO Error Logic 

Figure 2-10 shows a block diagram of the PCLO error logic and Table 2-3 lists the error 
signals generated in this MCA. 

Table 2-3 PCLO Errors 



Input Signal 



Fetch Errors 



DECODE_PC[23:13] 

PREFETCH_PC_H[23:13] 

DECODE_PC[05:00] 

EBOX_RESULT[15:08] 

BP_PREDICTION_PC[15:13] 

BP PREDICTION 



Decode Errors 



XSCA_SHIFTCOUNT_H[03:00] 
XSCA_SHIFTOPCODE_H 



Intermediate Error 



PCLO_DECODE_PC_ERROR 

PCLO_PREFETCH_PC_ERROR 

PCLOJPCBP.DECODEJPCJERROR 

PCLO_EBOX_RESULT_ERROR 

PCLO_PRED_PC_15_13_ERROR 

PCLO_BP_PREDICTION_ERROR 



PCLO_DECODE_ERROR 
PCLO_DECODE_ERROR 
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Figure 2-9 PCBP Error Logic 
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Figure 2-10 PCLO Error Logic 
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2.4.4 PCHI Error Logic 

Figure 2-11 shows a block diagram of the PCHI error logic and Table 2-4 lists the error 
signals generated in this MCA. 

Table 2-4 PCHI Errors 



Input Signal 



Intermediate Error 



Fetch Errors 



DECODE_PC[31:24] 
PREPETCH_PC_H[31:24] 
DECODE_PC[05:00] 
DISPLACEMENT! 11:08, 03:00] 
DISPLACEMENT[15:12, 07:04] 
BPTD_TAG_DISPLACEMENT[15:00] 
BP PREDICTION 



PCHI_DECODE_PC_ERROR 

PCHI_PREFETCH_PCERROR 

PCHI_PCBP_DECODE_PC_ERROR 

PCHI_XDTA_DISP_ERROR 

PCHI_XDTB_DISP_ERROR 

PCHI_BP_TAG_DISP_ERROR[01:00] 

PCHI BP PREDICTION_ERROR 



Decode Errors 

XSCA_SHIFTCOUNT_H[03:00] 
XSCA_SHIFTOPCODE_H 



PCHI_DECODE_ERROR 
PCHI DECODE_ERROR 



2.5 PCU Inputs 

The MBox inputs to the PCU are in response to an instruction buffer request to the 
MBox. The two signals the PCU receives are as follows: 

• MBOX_IB_RESPONSE_H directs the control of the MTAG and the prefetch PC when 
the MBox is returning data for a VIC refill. 

• MBOX_IB_PAGEFAULT_H inhibits writing page faulted data to the VIC. 

The EBox inputs to the PCU validate predicted branches and, when the IBox is flushed, 
provides a starting address. The primary EBox inputs to the PCU are as follows: 

• EBOX_RESULT_H[31:00] provides the starting address after a flush. 

• EBOX_BRANCH_A_H, when asserted, indicates a bad branch prediction. 

• EBOX_BRANCH_VALID_A_H validates EBOX_BRANCH_A_H. 

The XBAR provides shift counts and error status to the PCU. The XBAR inputs are as 
follows: 

• XSCA_SHIFTCOUNT_H[03:00] is used to increment the decode PC. 

• XDTB_DECODE_ERROR_H informs the PCU of an error in the instruction decode 
pipeline stage of the IBox. The signal sets the PCU in a hold state. 
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Figure 2-11 PCHI Error Logic 
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The OPU provides flush signals and the target PC for branch instructions. The OPU 
inputs are as follows: 

• OCTL_PC_FLUSH flushes the IBox to a new decode PC, provided by the EBox. 

• OCTL_IBUF_FLUSH holds the prefetch PC until the IBox can resume decoding an 
I-stream. The signal is sent when the IBox has to wait for one of the VIC block valids 
to be flushed. 

• OPU_TARGET_PC_H[31:00] is provided by the CSU on branch instructions. 

• OCTL_VIC_FLUSH_H flushes the VIC. 

2.6 PCU Outputs 

The PCU directs outputs to the MBox, EBox, XBAR, and OPU. The outputs direct the 
flow and execution of instructions through the functional units that receive them. 

The MBox receives a byte parity protected address that directs the prefetching of I- 
stream. These two outputs are as follows: 

• IBOX_IB_ADDRESS[31:03] is the prefetch PC. (The lower three bits are always zero.) 

• IBOX_IB_ADDRESS_PARITY[03:00] is the byte parity for the address that the PCU 
is sending the MBox. 

The EBox receives a copy of the decode PC and information pertaining to branches. The 
signals the EBox receives are as follows: 

• IBOX_PC[31:00] is a copy of the decode PC. The EBox, after execution of the 
instruction to which this PC points, stores it in the PC history buffer, or uses it 
during exception processing. 

• IBOX_PC_PARITY[03:00] is the byte parity for the IBox PC. 

• IBOX_CORRECTION_H is asserted if the IBox can correct a bad branch prediction 
before it is shifted out of the IBUF. 

• IBOX_PC_VALID_H is asserted when a new valid opcode is shifted into the opcode 
byte of the IBUF. 

• IBOX_PREDICTION_H, asserted, informs the EBox that the IBox has predicted a 
branch to be taken. 

The XBAR receives two signals from the PCU: 

• PCHI_UNWIND_H is asserted when the EBox detects a bad branch prediction. The 
signal asserts a restart signal in the three XBAR MCAs. 

• PCHI_DECODE_ERROR_H is asserted by a parity error on XSCA_SHIFTCOUNT_ 
C_H or XSCA_SHIFTOPCODE_C_H. 

The following PCU outputs provide the OPU with the decode PC and branch information: 

• DECODE_PC_H[31:00] is the CSU's copy of the decode PC. 

• PCHI_UNWIND_H is asserted on a bad branch prediction. This signal initiates a 
flush of the CSU. 

• PCHI_CORRECTION_H is input to the branch count logic of the CSU. 

• PCVC_VIC_FIP_H is sent to OCTL when the VIC is being flushed. Asserting this 
signal directs OCTL to stall the IBox if another flush is received before the first flush 
is complete. (The VIC flush requires 256 cycles.) 
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This chapter describes the functional units representing the instruction fetch pipeline 
stage: the VIC, the instruction buffer, and the MBox interface. 

3.1 VIC 

The VIC is a direct-mapped, 8-Kbyte cache with a block size of 32 bytes and a fill size of 
8 bytes. Physically, the VIC is made up of four groups of STRAMs: 

• Data STRAMs — Store one quadword of data per location. 

• Tag STRAMs — Store bits [31:13] of the address of each block in the VIC. 

• Block valid STRAMs — Store one bit to indicate that the VIC block is valid. 

• Quadword valid STRAMs — Store four bits to indicate that a corresponding 
quadword is valid. 

VIC data STRAMs are on the VIC MCU and the VIC tag STRAMs are on the XBR MCU. 
A VIC block consists of four quadwords of VIC data. There are 1024 VIC data locations 
and 256 VIC tags, VIC block valid bits, and VIC quadword valid bits. There are two 
separate block valid bits for the VIC. Figure 3-1 is a simplified block diagram of the 
VIC. 
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Figure 3-1 VIC 
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3-2 Instruction Fetch 

3.1.1 VIC Hit 

The instruction buffer determines a VIC hit (VIC_MATCHA_H or VIC_MATCHB_H 
asserted) if the following conditions are met: 

The VIC block is valid. 

The prefetch PC matches the VIC tag. 

The quadword is valid. 

If these conditions are not met, a VIC miss is signaled and an instruction buffer request 
is made to the MBox. 

3.1.2 VIC Data Write 

Control of the VIC is provided by the prefetch PC and the instruction buffer. VIC match 
and instruction buffer request logic are in the instruction buffer, while the requested 
address is provided by the prefetch PC. 

Figure 3-2 shows a block diagram of the VIC data write function. The instruction buffer 
request initiates the following sequences: 

• One cycle after the request is made, the instruction buffer asserts IBUF_REQUEST_ 
IN_PROCESS. This signal selects the memory tag (MTAG) to address the VIC. 

• MBOX_IB_RESPONSE is asserted when the MBox responds to the request. Without 
any stalls present or a page fault on the requested data, this signal provides the write 
enable for the VIC data. 
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Figure 3-2 VIC Data Write 
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3.1.2.1 VIC Address Selection 

The VIC is addressed by the prefetch PC on VIC requests and is addressed by the MTAG 
when the I-stream is being returned from the MBox. The MTAG is the address that is 
returned from the MBox in response to an instruction buffer request. 

Figure 3-3 shows the generation of the VIC address. The prefetch PC provides the 
requested address to the MBox and is then held from incrementing until the MBox 
returns the requested data. When the data is returned (MBOX_IB_RESPONSE_H 
asserted), the prefetch PC begins incrementing again. 

VIC_ADDRESS_SELECT_H is asserted to select MTAG when the MBox returns the 
requested I-stream. 

The prefetch PC does not address the VIC when data is returning from the MBox because 
the XBAR does not always consume a quadword of I-stream in a single cycle. If the 
prefetch PC did address the VIC and the XBAR did not consume a quadword of I-stream 
each cycle of the response, the prefetch PC would be incremented past the last quadword 
that the instruction buffer receives next. The prefetch PC is loaded with the MTAG at 
the end of an instruction buffer request so that it points to the next quadword that the 
instruction buffer requests. 
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Figure 3-3 VIC Address Selection 

3.1.2.2 VIC Data 

With the VIC data write signal (PCVC_VIC_DATA_WRITE_H) enabled and the VIC 
address selected, the MBox returns the I-stream and writes it to the STRAMs. The 
I-stream arrives one quadword per cycle and is byte parity protected. 

Most requests for VIC data are for four quadwords, but requests for less can be issued. 
If a request is for less than four quadwords, the MBox returns the requested quadword 
and the remaining quadwords in the block. That is, the request does not wrap around 
the block. For example, if the requested address is for the third quadword in a block, 
quadwords 2 and 3 are returned, but quadwords and 1 are not. 
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3.1.3 VIC Tag Write 

As VIC data is being written to the cache, the tags are also written. As the requested 
I-stream is returned from the MBox, the PCU writes the tag field (bits [31:13] of the 
MTAG) and validates the block and quadword valid bits for that location in the VIC. 
Figure 3-4 is a block diagram of the VIC tag write function. 

MBOX_IB_RESPONSE_H is asserted in the first cycle of a response from the MBox and 
enables writing the tag and valid fields to the VIC tag (VICT) STRAMs. 

The following sections describe the four write functions associated with the VIC tag field. 
The four fields to be written are as follows: 

VIC tag 

Quadword valid bits 
Block valid bit 
Tag parity 
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3.1.3.1 Tag Write 

When the requested I-stream is returned from the MBox, the PCU writes MTAG bits 
[31:13] to the tag field. The address of the Tequest was originally supplied by the prefetch 
PC, but when the I-stream is returning, the VIC is addressed by the MTAG. 

When the VIC is receiving multiple responses from the MBox, the address of the VIC 
must be incremented by a quadword in each cycle of the MBox response. The prefetch 
PC is not incremented by a quadword until a quadword of I-stream is consumed by the 
XBAR. Because of this, a new address (MTAG) must be used when the I-stream comes 
from the MBox to the VIC. 

To address subsequent returning quadwords, bits [04:03] of the VIC address are 
incremented each cycle. The VIC address is then loaded into the MTAG and used as 
the returning quadword address. 

3.1.3.2 Quadword Valid Write 

As each quadword returns, bits [04:03] of the MTAG are decoded to identify the quadword 
being written. The bit field is decoded and the single bit valid field is written as the data 
is written to the data field. Validating this field as the data is being written provides 
valid data, for the instruction buffer, in the first cycle of the MBox response. 

3.1.3.3 Block Valid Write 

The VIC contains two sets of the block valid field. The 2-bit field, when validated, 
indicates that valid data is in the VIC. The purpose of the two copies of the field is to 
enable a VIC flush in a single cycle. 

To validate this field, the correct block valid is selected, then the valid bit is latched into 
the STRAM. The field is validated when the first quadword, of a request, is returned from 
the MBox. 

A block valid selection and write is initiated by the assertion of MBOX_IB_RESPONSE_ 
H. The selection of the block valid is achieved by inverting USE_BLOCK_A_VALID. This 
selects the block that was not in use since the last VIC flush. 

3.1.3.4 Tag Parity 

All VIC tag parity checking is performed in IBFA. When a VIC tag parity error is detected 
in IBFA, IBFA_VIC_ERROR_H is asserted and passed to the IBFB fetch error circuitry. 
IBFB asserts IBFB_FETCH_ERROR_H and forwards it to the EBox when a VIC tag 
parity error is detected. 
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3.1.4 VIC Flush 

The VIC is flushed on every REI and, in most cases, the flush is achieved in a single 

cycle. 

A flush is initiated when OCTL_VIC_FLUSH_H is asserted. This signal is generated in 
OCTL and is the result of a flush code sent by the EBox. Receipt of this signal inverts 
the selection of the block valid being used and initiates the flush of the original block 
valid that was in use. The flush requires 256 cycles. 

Each cycle of the block valid flush, VIC_CLEAR_ADDRESS_H[08:00] is incremented. 
The flush is complete when bit 8 is set (VIC_CLEAR_ADDRESS_H[08:00] = IFF). 

Once a flush is initiated by OCTL, the PCU asserts VIC_FLUSH_IN_PROGRESS_H. 
This signal remains asserted until the flush is complete and prohibits issuing another 
flush request until it is negated. 

3.1.5 VIC Data Read 

When the instruction buffer requires a new I-stream, compare logic on the IBFA MCA 
checks the VIC tag field for a resident block and valid quadwords then compares the 
requested address with the VIC address tag field. If there is a match, a VIC hit (VIC_ 
MATCH_H) is signaled and the requested I-stream is loaded into the instruction buffer. 
Figure 3-5 is a simplified block diagram showing the VIC compare logic. 
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Figure 3-5 VIC Match Logic 
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The block valid bit is checked to detect a valid VIC block. The prefetch PC (bits [04:03]) 
selects the quadword valid to be compared. The tag address is compared with prefetch 
PC [31:13]. A match of the two address fields enables VIC_MATCH_H. 

A VIC hit asserts READ_VIC_DATA_H and the requested I-stream is written to the 
instruction buffer. 

3.1.5.1 VICSTRAM Bypass 

During a VIC refill, it is possible to write and validate the data and latch the data in the 
instruction buffer in the same cycle. 

When the MBox is returning the first quadword of an I-stream, the block and quadword 
are marked valid and the tag and data are written to the STRAMs. By placing the data 
on the output latches of the STRAMs, the instruction buffer is allowed to receive the 
valid data late in the same cycle. 

If the XBAR consumes the instruction buffer data in a single cycle and the MBox is 
simultaneously returning VIC data, subsequent quadwords are latched in the instruction 
buffer through the VIC STRAM bypass. 

3.1.6 VIC Parity Coverage 

The VIC can cause two errors: fetch errors and decode errors. The VIC tag (VICT) fields 
are parity checked on the IBFA MCA. VIC data (VICD) is parity checked on IBFA and 
IBFB. 

VIC data is byte parity protected and is checked on the output, as the data is latched into 
the instruction buffer. IBFB_FETCH_ERROR_H is asserted when this error is detected. 

Block valid parity, tag parity, and quadword valid parity is checked on IBFA. Errors 
in these units assert IBFA_VIC_ERROR_H. This signal is latched to IBFB and asserts 
IBFB_FETCH_ERROR_H. 

The SPU can disable the VIC parity error detection logic. Enabling this feature 
suppresses reporting of VIC parity errors. 

3.1.7 Disabling VIC Hits 

The SPU can disable the VIC hit logic. Enabling this feature causes every VIC access to 
result in a VIC miss. The instruction buffer is then forced to initiate an MBox request 
each time it needs an I-stream. This feature can be used as a troubleshooting aid or as a 
temporary solution to an excessive error rate in the VIC. 
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3.2 Instruction Buffer 

The instruction buffer logic is physically partitioned across the IBFA and IBFB MCAs on 

the XBR MCU, with IBFA handling the low nibble and IBFB handling the high nibble. 

The primary function of the instruction buffer is to present the I-stream to the XBAR for 

decoding. 

The instruction buffer is comprised of six major functional units (Figure 3-6): 

IBUF 

IBEX 

IBEX2 

Rotator 

Shifter 

Merger 

Under usual conditions, the instruction buffer is filled in the following manner: The 
first quadword of VIC data passes through the rotator and merger and is loaded into 
the IBUF. The second quadword is loaded into IBEX, and IBEX2 receives the third 
quadword. As the IBUF bytes are decoded, they are shifted out and replenished with 
bytes from IBEX. The rotator, shifter, and merger align the remaining IBUF bytes and 
the new IBEX bytes in sequential order in the IBUF. When the IBEX becomes empty, the 
quadword in IBEX2 is loaded into IBEX and IBEX2 requests more data from the VIC. If 
no valid data is in the VIC, then a request is passed to the MBox. 
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Figure 3-6 Instruction Buffer 
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IBEX and IBEX2 are quadword buffers between the VIC and IBUE Data can be loaded 
into IBUF from either of these registers, with valid data in IBEX taken before any valid 
data from IBEX2. Data taken from either of these sources is rotated by the rotator to 
provide the correct byte position in IBUE 

As the data in IBUF is consumed, decoded bytes are shifted out and replenished with 
new valid data from IBEX IBEX2, or the VIC. The shifter provides this function by 
shifting the decoded bytes out and shifting remaining bytes down into vacant lower byte 
positions. 

The merger ties the functions of the rotator and shifter together by replenishing IBUF 
with valid data from both the shifter and rotator. 

Control for the instruction buffer depends primarily on three signals: 

• IBUF_VALID_COUNT_H[03:00] is the number of valid bytes in the 9-byte IBUF. 

• IBEX_VALID_COUNT_H[03:00] is the number of valid bytes in the 8-byte IBEX 

• XSCA_SHIFTCOUNT_H is the number of IBUF bytes the XBAR has decoded in a 
single cycle. 

These three signals direct the flow of data through the rotator and shifter, and they 
initiate requests to the VIC and MBox for replenishment. 

The shift count (XSCA_SHIFTCOUNT_H[03:003) is the number of bytes the XBAR has 
decoded and directs the shifter to shift decoded bytes out of IBUF. 

The IBUF valid count (IBUF_VAL_H[03:00]) selects rotate and merger data for 
replenishing IBUF. This valid count is calculated by subtracting the XBAR shift count 
from the previous IBUF valid count. 

The IBEX valid count (IBEX_VALID_COUNT_H[02:00]) is the number of valid bytes 
in the 8-byte IBEX. This valid count is used by the rotator to select valid bytes to be 
loaded into IBUF. When no valid bytes are in IBEX (IBEX_VALID_COUNT_H = 0), data 
is loaded into IBUF from IBEX2, the VIC, or a request is made to the MBox. 

3.2.1 IBEX2 

IBEX2 is an 8-byte buffer between the VIC and IBEX. IBEX2 receives and outputs data 
eight bytes per cycle. IBEX2 receives input data from the VIC and outputs to either 
IBEX or IBUF. Which unit receives IBEX2 data depends on the valid counts of IBEX and 
IBUF. For example, if IBEX is empty (IBEX_VALID_COUNT_H = 0) and IBUF is full 
(IBUF_VALID_COUNT_H = 9), then the eight bytes of IBEX2 data is loaded into IBEX. 
This load is accomplished by selecting IBEX2 data at the rotator multiplexer (IBEX_ 
ROTATE_SELECT_H[01:O0] = 1). The data is not passed through the rotator because the 
IBUF valid count indicates that IBUF is full. 

Validating IBEX2 requires only a single bit, as the buffer is either full or empty. IBEX2_ 
VALID_H is asserted when valid data is in IBEX2. To load data into IBEX2 and assert 
the IBEX2 valid bit, IBEX must contain valid data (IBEX_EMPTY_TB_H is negated) and 
a VIC match must occur (VIC_MATCHA_H or VIC_MATCHB_H asserted). 
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When IBEX2 contains valid data (IBEX2_VALID_H asserted), the prefetch PC cannot be 
incremented past the next sequential quadword it is addressing. A hold signal (IBUF_ 
HOLD PREFETCH_PC) is asserted to stop incrementing the PC. This signal is asserted 
when IBEX2_VALID_H is asserted and IBEX_EMPTY_H is negated. The PC is held 
because it must continue to point to the next quadword the instruction buffer will receive. 
When IBEX2 becomes empty, the quadword that the prefetch PC is addressing is loaded 
into IBEX2 and the PC is incremented. 

3.2.2 IBEX 

IBEX, unlike IBEX2, can contain valid data in any of its byte locations. This allows IBEX 
to replenish IBUF with any number of bytes as they are decoded by XBAR. Data that is 
loaded into IBEX is sent to the rotator of the instruction buffer. IBEX data is passed to 
the rotator in a low-to-high byte order, providing IBUF with a sequential I-stream. 

IBEX receives data from either IBEX2 or the VIC. If IBEX2 is valid, then it replenishes 
IBEX. If IBEX2 is not valid, then the VIC (if valid) replenishes IBEX. 

3.2.2.1 IBEX Valid Count 

Because IBEX replenishes IBUF with as many bytes as needed, it can contain any 

number of valid bytes between eight and zero. 

The IBEX valid count (IBEX_VALID_COUNT_H[03:00]) is calculated by subtracting the 
valid count, of the previous cycle, from the number of bytes loaded into IBUF (MERGE_ 
COUNT_H[03:00]). 

When IBEX contains valid data, and the IBUF valid count (IBUF_VC_H[03:00]) is less 
than nine, IBEX data is selected at the rotator multiplexer. The IBEX data is placed 
on the input of the rotator. The rotator rotates the bytes needed to fill IBUF (ROTATE. 
SELECT_H[03:00]) and passes the rotated IBEX bytes to the merger. The merger passes 
shift data and rotate data to IBUF and produces a merge count (MERGE_COUNT_ 
H[03:00]). The previous IBEX valid count is subtracted from the merge count to produce 
a new IBEX valid count. 

When the IBEX valid count is decremented to zero, IBEX_EMPTY_H is asserted and 
directs IBEX_ROTATE_SELECT_H[01:00] to select VIC data or IBEX2 data at the IBEX 
rotator multiplexer. When either of these sources is supplying data to IBUF, and IBUF 
does not require all eight bytes, the remaining bytes are stored and validated in IBEX. 

3.2.3 Rotator 

The rotator aligns the bytes from IBEX IBEX2, or the VIC so that they are correctly 
placed in IBUF. This logic consists of a bank of multiplexers that can select any of the 
eight bytes and insert them into IBUF in the correct byte position. 

The rotator is controlled by the IBEX valid count, the IBUF valid count, and the XBAR 
shift count. The rotator receives data from the IBEX rotator multiplexer, which selects 
IBEX data, IBEX2 data, or VIC data to be input to the rotator. The source of the data 
at the IBEX rotator multiplexer is selected by IBEX_ROTATE_SELECT_H[01:00]. This 
select signal is derived from IBEX_EMPTY_H (asserted when IBEX contains no valid 
data), IBEX2_VALID_H (the single valid bit for IBEX2), and READ_VIC_DATA_H (the 
read enable signal for the VIC). 

IBEX_ROTATE_SELECT_H[01:00] selects valid data from IBEX before selecting IBEX2 
or VIC data. If the IBEX data is not valid, IBEX2 data is selected (if valid) before the 
VIC data. If neither IBEX nor IBEX2 data is valid, VIC data is selected. 
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Figure 3-7 shows the rotator supplying IBEX data to IBUF to replenish four bytes that 
have been decoded and shifted out. 

• During cycle 1, four bytes, including the opcode, are decoded and shifted out of IBUF. 
IBEX contains seven valid bytes and IBEX2 is valid. 

• During cycle 2, IBUF bytes 4 through 9 are shifted into bytes through 4, and IBEX 
bytes 1 through 4 are rotated into the empty IBUF bytes. IBUF is full, IBEX contains 
three valid bytes, and IBEX2 is valid. 
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Figure 3-7 Rotator 
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3.2.3.1 IBEX2 Rotate Data 

When IBEX is empty, IBEX2 (if valid) replenishes IBUF with an I-stream. Figure 3-8 
shows the loading of IBEX2 data into IBUF. The quadword that IBEX2 contained is 
partially loaded into both IBUF and IBEX. In Figure 3-8, the initial IBUF valid count is 
nine, the IBEX valid count is zero, and IBEX2 is valid. 

• During cycle 1, seven bytes of IBUF data are decoded and shifted out of IBUF. The 
new IBUF valid count is two. 

• During cycle 2, IBEX2 data is selected at the rotator multiplexer. Seven bytes of 
IBEX2 data are loaded into IBUF; one byte is loaded into IBEX The IBEX2 valid 
bit (IBEX2_VALID H) is negated, the IBEX valid count (IBEX_VALID_COUNT_ 
H[03:00]) is one, and the IBUF valid count (IBUF_VC_H[03:00]) is nine. 

When the IBEX2 valid count is negated, a read request is sent to the VIC and the 
prefetch PC is incremented (IBFA_HOLD_PREFETCH_PC_H is negated). 
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3.2.4 Merger 

The merger consists of nine, 8-bit, 2-to-l multiplexers where each multiplexer replenishes 
one of the nine bytes of the IBUF. Either rotate data or shift data can be selected as the 
input, with IBUF as the destination of the output. 

Figure 3-9 shows a simplified block diagram of the merger. Each merger multiplexer 
receives SHIFT_DATA_H and ROTATE_DATA_H for inputs. MERGE_SELECT_H[08:00] 
selects each byte to be loaded into IBUF. 
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Figure 3-9 Simplified Merger 
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Merge select is calculated by subtracting the XBAR shift count from the IBUF valid 
count. The select logic outputs a 9-bit field (MERGE_SELECT[08:00]) ; with each bit 
controlling one of the multiplexers of the merger. A logical one, in the field, selects rotate 
data while a logical zero selects shift data. 

Figure 3-10 shows the merger supplying IBUF with data from both the shifter and the 
rotator. With a XBAR shift count of 4, bytes 1 through 4 are shifted out of IBUF and 
bytes (opcode) and 5 through 8 are inputs to the merger from the shifter. The merge 
select logic receives the XBAR shift count and the IBUF valid count, and outputs the 9- 
bit field that selects shifter bytes 1 through 4 and rotator bytes 5 through 8. The opcode 
byte, because it is not shifted, is recirculated through the shifter and merger until the 
instruction is completely decoded. 

Table 3-1 is a sample of the merge select output. All outputs are based on an initial 
IBUF valid count of nine with different shift counts. 



06 07 06 05 



04 03 02 01 00 



1 1 1— "— T 

INSTRUCTION BUFFER 



I ' I ' I ' I ' r 



ROTATE DATA 



XBAR_SHIFTCOUNTI03 00) - 4 



SHIFT DATA 



IBUF_VALID_COUNT|03 00] - 9 



MERGE 
SELECT 



MERGE.SE 



LECT[08:00] - 1E0_/ \ 



SHIFT DATA 



L 



IBEX2 DATA 



A 



MR_X006B_B9 



Figure 3-10 Merger 



Table 3-1 Sample Merge Select 
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3.2.5 Shifter 

The instruction buffer shifter is responsible for shifting out decoded bytes, holding the 
opcode byte (byte 0), and realigning the remaining valid bytes in IBUF. XBAR provides 
three signals that control the shifter: 

• XSCA_SHIFTCOUNT[03:00] provides the number of IBUF bytes decoded in the last 
cycle. 

• XSCA_SHIFTOPCODE shifts the opcode out of byte 0. The signal is received when 
all of the specifier bytes of an instruction have been decoded. 

• XSCA_FD_SHIFTOPCODE is asserted when the XBAR detects an FD (extended 
opcode) in the opcode byte of IBUF. The FD byte is shifted out of byte and the 
second byte of the opcode is shifted into byte 0. 

NOTE 

Figure 3-11 provides examples of the state of IBUF after the completion of three 
cycles. For simplification, no new I-stream is placed in IBUF at die completion 
of each cycle. 

• Cycle 1 — XBAR decodes one byte and produces a shift count of one. Byte 1 is 
shifted out of IBUF and the remaining bytes are shifted down to replace it. 

• Cycle 2 — XBAR decodes the two remaining bytes of the instruction, asserts XSCA_ 
SHIFTOPCODE_H, and produces a total shift count of three. The opcode and the two 
remaining bytes of the instruction are shifted out, with the remaining bytes in IBUF 
again being shifted down to replenish those shifted out. 

• Cycle 3 — XBAR decodes the FD in the opcode byte of IBUF, asserts XSCA_FD_ 
SHIFTOPCODE_H, and produces a total shift count of one. The FD in the opcode 
byte is shifted out and the remaining bytes are shifted down one byte. 
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Figure 3-11 IBUF Data 
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3.2.6 IBUF 

IBUF holds nine bytes of I-stream that are passed to XBAR for decoding. Byte always 
contains the opcode of the instruction that is being decoded. A copy of this opcode is sent 
to the EBox, OPU, XBAR, and the branch prediction logic. 

Specifier bytes are parsed by XBAR and handled by the XBAR decode units while the 
opcode bytes remain in IBUF until the instruction has been completely decoded. 

3.2.6.1 Simple Decode 

As the I-stream is presented to XBAR for decoding, logic in the instruction buffer 
performs a small amount of decode. The decoded I-stream provides branch information 
for the PCU and CSU. Register, short literal, and YREG information is also decoded and 
passed to XBAR. Figure 3-12 shows a block diagram of the instruction buffer decode 
units. 
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3.2.6.2 Branch Decode 

Each time a new opcode is loaded into IBUF, the opcode is decoded to detect a branch 
instruction. The decoded branch opcode has two destinations, the CSU and the PCU. 
The CSU receives a single signal (IBFB_BRANCH_INSTRUCTION_H) that indicates a 
branch instruction is currently being decoded. 

The PCU receives more detailed information relevant to the branch. The four signals 
provide information to aid in directing the PCU and branch prediction cache (BPC) 
handling. The four signals provided by the branch decode logic are as follows: 

• IBFB_BRANCH_OPCODE_H is asserted if the opcode is a branch instruction. 

• IBFB_UNCONDITIONAL_H is asserted if the opcode is an unconditional branch. 

• IBFB LOOP_BRANCH_H is asserted for any loop branch instruction, except for 
an emulated branch. (Loop branches are AOBLEQ, AOBLSS, SOBGEQ, SOBGTR, 
ACBL, ACBW, and ACBB.) 

• IBFB_CACHEABLE_H informs the PCU that the branch opcode in IBUF can be 
cached in the BPC. (Noncacheable branches are RSB, JSB, JMP, CALLG, and CALLS. 
These instructions do not have normal branch displacement specifiers.) 

3.2.6.3 YREG Decode 

The low nibble of bytes 1, 2, 3, and 4 are decoded to provide YREG information for XBAR. 
When all four bits are set, PC addressing is possible. 

3.2.6.4 Short Literal Decode 

Bytes 5, 6, and 7 are decoded to detect short literal specifiers. If the upper two bits of 
bytes 5, 6, and 7 are zeros, a short literal specifier is detected. Each bit in the signal 
corresponds to the byte number in the I-stream of IBUF. 

3.2.6.5 Register Mode Decode 

Bytes 5 through 8 are decoded to a value of 5. The high nibble of the IBUF I-stream is 
input to the decoder, producing an output with each bit in the signal corresponding to the 
byte being decoded (for example, if IBFB_REGISTER_MODE_H[08:05] = 3, then bytes 5 
and 6 may contain register specifiers). 

3.2.7 Instruction Buffer Parity 

Instruction buffer data is byte parity protected and is checked at two locations. The 
outputs of IBEX and IBUF contain parity detection circuitry. 

As data is output from IBEX, a parity check is performed. IBFB receives partial parity 
from IBFA and performs the check. Detected errors assert IBFB_IBEX_ERROR_H, 
which asserts IBFB_FETCH_ERROR_H and is forwarded to the EBox as IBOX_FORK_ 
ERROR.H. 

The data that is being presented to the XBAR is checked on the output to the instruction 
buffer. IBFB receives partial parity from IBFA and performs the check. Detected errors 
assert IBUF_ERROR_H, which is latched to the XBAR as DECODE_ERROR_H. This 
error is forwarded to the EBox as IBOX_POINTER_ERROR_H. 
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3.3 Instruction Buffer Interface 

The instruction buffer interface is a read-only port to the MBox. Requests are made 
across this port for the I-stream to be loaded into the VIC. Most requests are for a VIC 
block (four aligned quadwords). 

Figure 3-13 summarizes the instruction buffer interface to the MBox. Table 3-2 
describes each signal line. 
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Figure 3-13 Instruction Buffer Interface 
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Table 3-2 Instruction Buffer Interface Signals 



Name 



IBOX_IB_REQUEST_H 
IBOX_IB_ADDRESS_H[31:03] 

IBOX_IB_ADDRESS_PARITY_H[03:00] 

IBOX_IB_ABORT_H 

ffiOX_ABORT_H 

MBOX_IB_DATA_H[63:00] 
MBOX_IB_DATA_PARITY_H[07:00] 

MBOX_IB_RESPONSE 
MBOX IB_PAGE_FAULT_H, L 



Description 



The request for data from the MBox. Detection of a 
VIC miss or flush of the VIC asserts this line. 

Address lines for the requested quadword. The 
address is generated by the PCU and, because all 
requests are quadword aligned, the lower three bits 
of the address are assumed to be zero. 

Parity protection for the address that the Box 
sends. 

Asserted to abort a request. The signal aborts a 
request only if it is in the TB stage of the MBox. 

Overriding abort signal for the interface. When 
asserted, it asserts IBOX_IB_ABORT. An EBox flush 
or branch prediction unwind asserts this signal. 
This signal also aborts OPU port transactions. 

Returning quadwords requested by the IBox. 

Byte parity for the quadword being returned by the 
MBox. 

Informs the IBox that data will be returned in the 
next cycle. The signal is negated late in the cycle 
that the last quadword is being returned. 

Asserted to inform the IBox that the requested 
data has page faulted in the MBox. The instruction 
buffer and the XBAR receive this signal and inform 
the EBox of the page fault if decode of the data is 
attempted. 
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3.3.1 Instruction Buffer Requests 

Figure 3-14 shows an instruction buffer request that is honored, with three quadwords 
written to the VIC. 

Usually, IBOX_IB_ADDRESS_H[31:03], IBOX_IB_REQUEST_H, and IBOX_IB_ABORT_ 
H are asserted. In response to these signals, the MBox latches the address, and detects 
and aborts the request each cycle. 

A request is initiated by negating the abort signal (IBOX_IB_ABORT_H) early in the 
cycle after the address has been latched in the MBox. This signal is negated when 
REQUEST_CONDITION_H is asserted by IBEX_EMPTY_H asserted, and IBEX2_ 
VALID.H, VIC_MATCHA_H, and VIC_MATCHB_H are all negated. The MBox detects 
the request early in this same cycle and the IBox lowers the request signal. 

The request is detected in the MBox translation buffer (TB) stage. In this stage, the 
MBox latches the request and address, and arbitrates for the TB. When control of the TB 
is received, a TB lookup and a validation of the translation is performed. Figure 3-14 
shows the request in the TB stage for a single cycle. Figure 3-14 is a best case example 
because TB arbitration could cause several cycles of delay if the MBox was honoring other 
requests. The instruction buffer port has the lowest priority of all ports that arbitrate for 
the TB. 

Cache arbitration results in a cache hit or miss. A cache hit asserts MBOX_IB_ 
RESPONSE_H, late in the cycle. In the following cycle, the MBox returns the requested 
quadword (MBOX_IB_DATA[63:00]) and subsequent quadwords to the end of the block. 

Late in the same cycle that the last quadword is being returned, the MBox response 
signal is negated. In the same cycle, the abort signal then the request signal are asserted 
again. 
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Figure 3-14 Instruction Buffer Request 
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3.3.1.1 Aborting Requests 

Two abort signals are associated with the instruction buffer interface: 

IBOX IB_ABORT_H 
IBOX_ABORT_H 

IBOX_IB_ABORT_H, when asserted, attempts to abort the request. This signal is not 
asserted for a request that has entered the cache stage because of the penalties the 
MBox must pay to clean up. The MBox returns the first quadword and then aborts. 
The quadword is not written to the VIC because IBOX_IB_ABORT_H, asserted, negates 
VIC_DATA_WRITE_H. 

IBOX_ABORT, when asserted, unconditionally aborts instruction buffer requests. This 
signal is asserted by an EBox flush (EBOX_FLUSH_H[02:00]) due to an error, interrupt, 
or branch unwind situation. 

3.3.1 .2 Page Faults 

When a page fault is detected, the MBox asserts MBOX_IB_RESPONSE_H and MBOX_ 
IB_PAGE_FAULT_H in the same cycle. The page fault signal is sent to the instruction 
buffer and the XBAR. The XBAR notifies the EBox of page faults if decode of the data is 
attempted. If the VIC is flushed, the data is not accessed and the MBox clears the page 
fault and associated registers. 
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4 

Instruction Decode 



This chapter describes the two functional units representing the instruction decode 
pipeline stage: the XBAR and branch prediction logic. 

4.1 XBAR 

The XBAR decodes the individual microinstructions and determines the following: 

Number of instruction specifiers 
Destination of each decoded specifier 
Number of specifiers decoded in a single cycle 

Each cycle, the XBAR attempts to decode and pass specifier data to the specifier handling 
units of the IBox: SLU, CSU, and FPL. Because each unit processes a unique specifier 
type, the number of combinations of specifiers the XBAR can successfully decode each 
cycle is restricted. The XBAR can decode up to three specifiers in a single cycle. The 
specifiers may be all register mode, two register mode and one short literal or complex, 
or one of each (register, complex, or short literal) in any order. Complex specifiers are 
branch displacements and all specifiers other than short literal mode and register mode 
specifiers. The XBAR concurrently processes specifiers of a single instruction only. 

The XBAR is also responsible for generating read and write register masks for conflict 
checking by the CSU. In special cases where conflicts occur in a single instruction, the 
XBAR redirects register specifiers to the CSU for processing. 

Figure 4-1 is a block diagram of the XBAR. As shown in Figure 4-1, the XBAR receives 
72 bits of instruction buffer data and distributes it throughout the associated MCAs 
(XSCA, XDTA, and XDTB). The following list introduces the major XBAR functional 
blocks. 
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Figure 4-1 XBAR Block Diagram 



DIGITAL INTERNAL USE ONLY 



Instruction Decode 4-3 



DRAM and XRAM — The DRAM and XRAM decode the opcode and produce the 
following: 

— Specifier count — The total number of specifiers the instruction contains. 

— Specifier data type — The data type of each specifier in the instruction (byte, 
word, longword, etc.). 

— Specifier access type — The access type of each specifier in the instruction 
(read, write, modify, etc.). 

The XRAM specifier count output controls the instruction buffer shifter by providing 
a running total of specifiers decoded each cycle and by informing the instruction 
buffer to load a new opcode into the opcode byte (byte 0) of the IBUF when all of the 
specifiers of an instruction have been decoded by the XBAR. 

The specifier attribute outputs (data type and access type) are distributed throughout 
the XBAR to the individual data path units (displacement, short literal, and source 
and destination logic units) to validate outputs to the specifier handlers. These 
outputs are also used to generate a read and write mask and to detect intra- 
instruction read conflicts (IRC). 

Simple decode logic — Simple decode logic decodes the I-stream and provides the 
addressing mode (register mode, absolute mode, and so on) for the first four bytes of 
the instruction. The addressing mode outputs are used to validate the data path unit 
outputs and by the decode tree logic and request logic when generating their shift 
count and specifier count outputs. 

Decode tree and request logic — Each cycle, the decode tree logic and request 
logic perform a parallel operation that determines the number of specifiers the XBAR 
will decode and the number of specifier bytes that will be decoded. 

The decode trees produce shift count and specifiers decoded outputs based on the 
decoded addressing modes and data types and by decoding the I-stream. There are 
14 pairs of output (one shift count and one number of specifiers decoded) generated 
each decode cycle. 

Request logic decodes addressing modes, specifier access types, and the specifier 
count. It produces an output that selects the correct decode tree output for the 
instruction currently being decoded. 

For example, for the instruction ADDL2 RO, Rl, the request logic determines that 
there are two specifiers, the access types are read and modify, and there are two 
register specifiers in the instruction. The request logic selects R2 (R2_SC_H[02:00] 
shift count and R2_N_H[001:00] specifiers decoded count) from the request logic. The 
R2 output provides the correct shift count to the instruction buffer and the correct 
number of specifiers decoded to the specifier count logic. 

The decode tree logic has produced 14 outputs for the ADDL2 RO, Rl with only one 
of the outputs producing the correct specifiers decoded and the correct shift count. 
The R2 output is produced by logic that produces only an output that is based on 
an instruction that contains two specifiers. In the same cycle that the R2 output 
is selected, outputs based on different numbers and structures of specifiers are 
produced. If the R3BW output were selected, the shift count and number of specifiers 
decoded would be equal to those based on decoding three specifiers, with the third 
specifier being a branch word displacement. 
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The parallel operation of the decode tree logic and the request logic is implemented to 
reduce the cycle time required to decode the specifiers of an instruction (to determine 
their addressing modes, access types, and data types) and then calculate the number 
of bytes that will be decoded and the number of specifiers that can be decoded. 

• Displacement logic — The displacement logic receives the complex specifiers (other 
than register and short literal specifiers) and, when a specifier is valid, passes up to 
32 bits of displacement to the CSU of the OPU MCU. 

• Short literal logic — The short literal unit decodes the short literal specifiers 
and passes six bits of short literal data to the SL specifier handler in the OPU for 
expansion. 

• Source and destination logic — The source and destination logic outputs source 
1 pointers, source 2 pointers, destination pointers, and a read and write register 
field to the OPU specifier handlers and to the read/write mask logic. The source 
and destination pointers define the operand addresses (register number or memory 
location) of the specifiers. The read/write masks record the reading and writing of 
registers by the EBox during the execution of an instruction. 

• IRC logic — The IRC logic detects intra-instruction read conflicts. These conflicts 
occur when there are read conflicts in the specifiers of an instruction. That is, an 
instruction specifier directs the EBox to read RO and a subsequent specifier directs 
the IBox to autoincrement or autodecrement RO. When IRCs occur, the IBox decodes 
each specifier separately and does not update the EBox GPRs until instruction 
execution is completed by the EBox. 

4.1.1 DRAM 

The XBAR DRAM is implemented as a functional logic block in the XSCA MCA. That is, 
the DRAM is comprised of logic gates instead of a RAM structure, as with previous VAX 
systems. 

The inputs to the DRAM logic are the instruction opcode, the extended opcode bit, and the 
specifiers remaining in the instruction currently being decoded. Based on theses inputs, 
the number of specifiers the instruction contains and the specifier attributes (access type 
and data type of each specifier) are determined. The outputs of the DRAM are passed 
to the other functional units of the XBAR and are used to validate the specifier handler 
outputs and also are used in the specifier count and shift count logic. Figure 4-2 shows a 
block diagram of the DRAM logic. 
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Figure 4-2 DRAM Logic 
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DRAM data type logic decodes the opcode (I_DATA_H[07:00]) and the extended opcode 

bit (EXTENDED.H), and produces SP1_DATA_TYPE_H[02:00] through SP6_DATA_ 

TYPE_H[02:00]. Each output describes the data type of the specifier it represents. (For 

example, SP1_DATA_TYPE_H[02:00] describes the data type of the first specifier of the 

instruction being decoded.) 

Each cycle SPECIFIERS_COMPLETED_H[02:00] selects up to three of the data type 

outputs to be used by the XBAR decode units. SPECIFIERS_COMPLETED_H[02:00] is 

the number of specifiers that have been decoded in the instruction that is currently being 

decoded. 

DRAM access type outputs are also based on the decode of the instruction opcode and the 

extended opcode bit. The access type of each specifier (for example, SPEC1_READ_H and 

SPEC1_WRITE_H) is placed on the DRAM output multiplexers and is also selected by 

SPECIFIERS_COMPLETED_H[02:00]. 

4.1.2 XRAM 

The XRAM is physically structured like the DRAM and is contained on XDTB. The 
XRAM decodes the instruction opcodes and the extended bit, and produces five fields: 

• SPECIFIER_COUNT_H[02:00] is the total number of specifiers the instruction 
contains. 

• IMPLIED_MASK_H is asserted when a character string instruction is detected. 

• VSRC_FORK_MODIFY_H is asserted when a variable length bit field instruction is 
decoded. 

• SUSPEND_H is asserted because the instruction that is currently being decoded 
does not leave predictable results in memory or registers. XBAR continues to 
decode specifiers until it encounters a complex specifier and then suspends (stops 
all decoding) and waits for the EBox to restart it (EBOX_UNSUSPEND_H). Some of 
the instructions that assert SUSPEND_H are as follows: 

ADAWI (add aligned word interlocked) 
ADDP4 (add packed 4-operand) 
EDITPC (edit packed to character string) 

• STOP_H is similar to SUSPEND_H. STOP.H is asserted by certain instructions 
after all of the specifiers of the instruction have been decoded. When STOP_H is 
asserted, the XBAR does not process any specifiers until the EBox asserts EBOX_ 
UNSUSPEND_H. Some instruction that assert STOP_H are as follows: 

HALT 

CASEx (case byte, word, longword) 

CHMx (change mode) 
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4.1.3 Simple Decode Logic 

The simple decode logic receives the high nibble of the first four bytes of instruction 
buffer data and I_YREG_F[04:01] (used to determine which bytes are relative addressing 
mode or immediate addressing mode) from the instruction buffer simple decode logic. 
Outputs defining the addressing modes of bytes 1 through 4 of the I-stream are produced 
from these inputs. Figure 4-3 shows the inputs and outputs of the simple decode logic. 

The simple decode logic also decodes the I-stream to produce CASE_H[01:00]. This 2-bit 
signal defines where in the I-stream the complex specifier is located. CASE_H[01:00] is 
valid when any addressing mode other than short literal or register is detected. Table 4-1 
describes the four case outputs. 

CASE_H[01:00] is input to the decode tree logic to produce shift counts for complex 
specifiers. Because CASE_H[01:00] determines only the location of the complex specifier, 
it must be decoded with the addressing mode to produce the length of the specifier for the 
shift count logic. 
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Figure 4-3 Simple Decode Logic 
Table 4-1 Case Output 



CASE_H[01:00] Byte Position of Complex Specifier 



00 


3 


01 


2 


10 


4 


11 


1 
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4.1.4 Decode Tree Logic 

Each cycle, the decode tree logic decodes the I-stream and produces 14 unique outputs. 
One of the outputs is selected, at the end of a cycle, to produce a shift count and the 
number of specifiers decoded. Figure 4-4 shows the inputs and outputs of the decode tree 
logic. 
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Figure 4-4 XBAR Decode Trees 
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Each of the 14 outputs contains 2 fields. The two fields determine the number of 
specifiers decoded and a shift count for the XBAR: 

• Rx_SC_H[02:00] is the shift count. The shift count is the number of IBUF bytes that 
will be decoded when this tree output is selected. 

• Rx_N_H[01:00] is the specifier count. This count represents the number of specifiers 
that is decoded when this tree output is selected. 

The decode tree logic receives the following inputs: 

• I-stream — The high nibble of bytes 1 through 4 (I_DATA_H[38:37, 31:29, 23:21, 
15:13]) and their valid bits (I_VALID_H[04:01]). The valid signals are received from 
the IBUF valid count. 

• Addressing modes — The addressing mode logic decodes the I-stream and supplies 
the addressing mode. 

• Case — Indicates the location of the complex specifier in the I-stream. 

• Data type — The DRAM decodes the opcode and outputs the specifier data types to 
the tree logic. 

• IRC — IRC decode logic passes this signal when an IRC is detected in the I-stream 
currently being decoded. 

• Stall — If the CSU or SLU is stalled, the decode tree logic is notified. The stall signal 
from either of these two units (OPU_STALL_H and SL_BUSY_STALL) influences 
decode tree output. If the SLU is stalled, the decode tree cannot produce an output 
that selects data to be sent to the SLU of the OPU. 

Figure 4-5 shows a block diagram of the R2BW decode tree. This example of the tree 
logic is used because it is one of the less complex. 
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Figure 4-5 R2BW Decode Tree 
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The outputs of the R2BW tree logic are based on an I-stream that contains two valid 
specifiers, with the second specifier being a word displacement of a branch instruction. 

As shown in Figure 4-5, 1_VALID_H[03] (IBUF valid bit for byte 3) selects R2BW_N_ 
H[00] from I_VALID_L[01] or OPU STALL.L (the stall signal for the CSU). The valid 
bits for IBUF bytes 1 and 3 of the IBUF are ORed to produce R2BW_SC_H[00], and the 
valid bit for IBUF byte 3 and the CSU stall signal produce R2BW_SC_H[01] and R2BW_ 
N_H[01]. 

This tree logic output would be selected if the XBAR is decoding an ACBW Rl, R2, R3, 
displacement word and the first two specifier bytes of the instruction had been decoded in 
a previous cycle. The output would be valid because the XBAR is decoding two specifiers, 
with the second specifier being a branch word displacement. The output for this decode 
cycle of the instruction would be R2BW_SC_H = 2 and R2BW_N_H = 2. 

Table 4-2 describes the I-stream addressed by each decode tree. 
Table 4-2 XBAR Decode Trees 



Name 



Function 



RO 

Rl 

R1BB 

R1BW 

R1I 

R2 

R2BB 

R2BW 

R2I 

R2R 

R3 

R3BB 

R3BW 

R3XR 



No specifiers are decoded. 

Decodes one specifier. 

Decodes one branch byte specifier. 

Decodes one branch word specifier. 

Decodes one implied specifier. 

Decodes two specifiers. 

Decodes two specifiers; the first is a register or short literal specifier and the 
second is a branch byte displacement. 

Decodes two specifiers; the first is a register or short literal specifier and the 
second is a branch word displacement. 

Decodes two specifiers; the second is implied. 

Decodes two specifiers; the second is a register specifier. 

Decodes three nonconflicting specifiers. 

Decodes three specifiers; the third is a branch byte displacement. 

Decodes three specifiers; the third is a branch word displacement. 

Decodes three specifiers but not in the same cycle. 



4.1.5 Request Logic 

The request logic decodes the specifier addressing modes and access types, and it outputs 
the 4-bit field (REQUEST_H[03:00]) that selects 1 of the 14 decode tree outputs (Rx_ 
SC_H[02:00] and Rx_N_H[01:00]). Figure 4-6 shows the relationship of these units and 
shows the inputs and outputs of the request logic. 
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Figure 4-6 Request Logic 
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The request logic uses SPECIFIERS_NEEDED_H[01:00] to determine the number of 
specifiers to decode. (For example, if only one specifier is to be decoded, the request logic 
selects only one of the Rl trees.) The addressing mode logic provides four inputs: 

Register mode 
Short literal mode 
Index mode 
Complex 

These inputs determine the number of specifiers that can be decoded and passed to the 
specifier handlers. For example, for ADDL3 Rl #43 R5, the request logic and decode tree 
logic perform their parallel operations as follows: 

• Request logic — Receives SPECIFIERS_NEEDED_H[01:00] = 3 (the number of 
specifiers in the instruction) from the XRAM and specifier count logic. The addressing 
modes of the three specifiers in the instruction are provided by the simple decode 
logic. (REGISTER_MODE_H[03:01] = 5, specifiers 1 and 3 are register specifiers 
and SL_MODE_H[02:01] = 2, specifier 2 is a SL specifier). The access type of the 
specifiers is also input to the request logic. These inputs, supplied by the DRAM, 
would be READ_H[03:01] = Oil and WRITE_H[03:02] = 10, signifying read, read, 
write as the order of access in the instruction. 

Specifier data types (SP1_DATATYPE_H[02:00] and SP2_DATATYPE_H[02:00] = 
longword) are also input to the request logic. This signal is also from the DRAM. 

The request logic output (REQUEST_H[03:00]) produces an output based on the 
above inputs that selects the R3_SC_H[03:00] and R3_N_H[01:00]. 

• Tree logic functions — In parallel with the request logic functions, the decode trees 
decode the I-stream, the addressing modes, and data types of the specifiers and stall 
signals from the specifier handlers of the OPU to produce shift counts and specifier 
decode counts for the instruction. 

The tree logic (specifically the R3 tree) produces an output that produces shift counts 
(R3_SC_H[02:00]) and the number of specifiers decoded (R3_N_H[01:00]) for an 
instruction that contains three specifiers. 

The tree logic produces the R3_N_H[01:00] outputs first and uses them as a basis for 
generating R3_SC_H[03:00]. 

To generate R3_N_H[01:00], the logic inputs the valid counts for the instruction 
buffer data that is being decoded (I_VALID_H[08:00]). 
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4.1.6 Specifier Count Logic 

Specifier counts are produced every cycle to determine the initial number of specifiers 
an instruction contains (SPECIFIER_COUNT_H[02:00]), the number decoded in a 
cycle (SPECIFIERS_DECODED_H[02:00]), and the number that remain to be decoded 
(SPECIFIERS_REMAINING_H[02:00]). Figure 4-7 shows the specifier count logic. 

Initially, SPECIFIER_COUNT_H[02:00] is produced by the XRAM and input to the 
specifier count logic. The XRAM generates the specifier count based on the opcode of the 
instruction. 

The decoded specifier output of the decode tree logic (N_L[01:00]) is subtracted from the 
specifier count to produce SPECIFIERS_REMAINING_H[02:00]. 

• SPECIFIERS_NEEDED_H[01:00] is the number of specifiers needed in the current 
decode cycle. This signal is decoded so that it represents only a maximum of three 
because the maximum number of specifiers decoded in a single cycle is three. This 
signal is loaded into an adder and a comparator. 

• XSCA_SPECIFIERS_DECODED_H[01:00] is the latched value of N_L[01:00] and is 
loaded into an adder with XDTB_SPECIFIERS_REMAINING_H[02:00] to produce 
XSCA_SPECIFIERS_REMAINING_H[02:00]. 

• The comparator receives N_L[01:00] and XSCA_SPECIFIERS_REMAINING_ 
H[02:00], and outputs XSCA_ALL_SPECIFIERS_COMPLETED_H when they are 
equal. 
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Figure 4-7 Specifier Count Logic 
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4.1.7 Shift Count Logic 

Each cycle, the XBAR passes a shift count (XSCA_SHIFTCOUNT_H[03:00]) to the 
instruction buffer. The shift count directs shifting of decoded bytes out of the instruction 
buffer and replenishment with a new I-stream. 

Three signals provide control to the instruction buffer shifter: 

XSCA_SHIFTCOUNT_H[03:00] 

XSCA_FD_SHIFTOPCODE_H 

XSCA_SHIFTOPCODE_H 

Figure 4-8 shows the generation of these three signals. 

XSCA_SPECIFIER_COUNT_H[03:00] is generated from the decode tree logic SC_ 
H[02:00] and is passed as the value to be loaded into the instruction buffer shifter. 
When an instruction is completely decoded, XSCA_SHIFTOPCODE_H is asserted and 
selects the incremented SC_H[02:00] to be sent to the shifter. 

4.1.7.1 FD Shift Opcode 

XSCA decodes the FD opcodes and asserts XSCA_FD_DETECTED_H. This signal is 
passed to the shift count logic, which asserts XSCA_FD_SHIFTOPCODE_H and directs 
the instruction buffer to shift out the FD opcode. Shifting the FD out of the opcode byte 
results in a shift count of one, as no specifiers are decoded until the FD is shifted. 
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Figure 4-8 Shift Counts 
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4.1.8 Fork Logic 

The EBox receives a fork address for each instruction that is decoded by the XBAR. The 
fork address is supplied by the XBAR and the instruction buffer. 

The instruction buffer sends two fork signals: 

IBOX_FORK_ADDRESS_H[07:00] 
IBOX_FORK_ADDRESS_PARITY_H 

The fork address is a copy of the opcode. 

The XBAR outputs three fork signals: 

• IBOX_FORK_VALID_H is the valid signal for the fork address. 

• IBOX_FORK_ADDRESS_H[08] is asserted when an FD opcode is decoded. 

• IBOX_REGISTER_FORK_H distinguishes between a register and a memory reference 
(asserted = register) when a VSRC specifier is decoded. 

When a VSRC specifier is to be decoded, the fork is not validated until the determination 
between memory or register reference is made. 

4.1.9 XBAR Displacement Data Path 

Up to 32 bits of displacement can be passed by the XBAR to the CSU in a single cycle. 
Figure 4-9 shows the logic that decodes the complex specifiers and passes the related 
displacement to the CSU. 

The XBAR displacement logic outputs four fields to the CSU: 

• DISPLACEMENTS [3 1:00] is 32 bits of branch displacement or complex specifier 
data. 

• XREG_H[03:00] indicates the GPR of index register for indexed operand specifiers. 

• YREG_H[03:00] indicates the GPR of base register for the operand specifier being 
delivered. 

• XDTB_INDEXED_H indicates the specifier under decode is indexed mode. 

4.1.9.1 Displacement 

DISPLACEMENT_H[31:00] is provided by XDTA and XDTB. The 32-bit field is nibble 
sliced as follows: 

XDTA_DISPLACEMENT_H[27:24, 19:16, 11:08, 03:00] 
XDTB_DISPLACEMENT_H[31:28, 23:20, 15:12, 07:04] 

The displacement field is selected from the instruction buffer data by decoding: 

XSCA_REQUEST_Hf03:00] 

CASE_H[01:00] 

XSCA_X8F_H 

Decoding these three fields determines where in the I-stream the complex specifier is 
located and if the specifier is an extended immediate mode specifier (XSCA_X8F_H). 
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Figure 4-9 XBAR Displacement 

4.1.9.2 Extended Immediate Mode (X8F) Detection 

Most complex specifiers are decoded by the crossbar and passed to the CSU in a single 
cycle across the 32-bit XBAR to OPU data path. Extended immediate mode specifiers are 
of data types longer than 32 bits and require more than a single cycle to be decoded and 
passed to the OPU. Because of the size of these specifiers, special handling by the XBAR 
is required to process them. 

The special handling of these specifiers involves manipulating the shift counts and 
specifier counts produced by the XBAR. 

XSCA contains logic that detects extended immediate mode specifiers. This logic 
decodes the specifiers' data types (SP1_DATATYPE_H[02:01] and so on) from the DRAM, 
IMMEDIATE_MODE_H[04:01], INDEX_MODE_H[02:01], and CASE_H[01:00] (all from 
the simple decode logic). When extended immediate mode is detected, X8F_H, X8F_ 
INHIBIT_SHIFTOPCODE_H, and X8F_SC_H[02] are asserted. 

Asserting these X8F signals forces N_H[01:00] (from the decode trees and signifying 
the number of specifiers decoded) to equal 1 and also forces SC_H[02:00] (also from the 
decode trees and signifying the number of bytes to shift out of the instruction buffer) 
to equal 4. This scenario allows the XBAR to pass the decoded specifier to the OPU 
in multiple cycles without incorrectly affecting the specifier count logic and without 
asserting SHIFTOPCODE_H before the specifier is completely decoded. 
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4.1.9.3 XREG 

When an indexed specifier is decoded, XREG_H[03:00] is asserted and sent with 
XDTB_INDEXED_H to the CSU. XREG_H[03:00] identifies the index register. XDTB_ 
INDEXEDJH is generated by decoding the following: 

• IRC_REGISTER_MODE is asserted when the IRC is detected and the specifier is 
register mode. 

• IMPLIED_H is asserted by decoding the request logic field. 

• CASE_H[01:00] identifies the location of the complex specifier in the I-stream. 

• INDEX_MODE_H[02:00], from the addressing mode logic, defines which specifier, if 
any, is decoded as an index specifier. 

4.1.9.4 YREG 

YREG_H[03:00] is generated in XDTA and sent to OSQA (CSU) to indicate which GPR 
the CSU references for an operand address calculation. Byte 1, 2, 3, or 4 of instruction 
buffer data is selected by CASE_H[01:00] to produce the YREG output. 

The YREG generation logic also contains inputs from the request logic, DRAM, and 
IRC detection logic. When an IRC is detected, this logic supplies the GPR number for 
specifiers during IRC handling. 

4.1.10 XBAR Short Literal Data Path 

The XBAR short literal logic outputs a 6-bit short literal specifier, a valid bit, and the 
number of the short literal specifier to the short literal expansion unit of the specifier 
evaluation logic. Figure 4-10 shows the organization of the XBAR short literal logic. 

4.1.10.1 Short Literal Data Select 

The XBAR SL data path receives the following: 

• Instruction buffer data (I_DATA_H[67:64, 59:56, 51:48, 39:32, 31:24, 23:16, 15:08]). 

• SL_MODE_H[08:01] is from the XBAR simple decode logic and the instruction buffer 
simple decode logic. It identifies the bytes that contain short literal specifiers. That 
is, if bit 1 is asserted, then a short literal specifier is detected in the byte 1 position of 
the I-stream. 

• XSCA_REQUEST_H[03:00] is the output of the request logic. 

The specifier decode logic decodes the instruction buffer data and the addressing mode to 
produce the following: 

• SPx SL_H[05:00] contains the 6-bit short literal data for three specifiers (x = 1, 2, or 

3). ~ 

• SPx_SL_H corresponds to the three byte positions that the SL data could be in. SP1_ 
SL_H, asserted, denotes specifier 1 is a short literal specifier. 

The byte position of the short literal specifier (SPx_SL_H) is decoded with XSCA_ 
REQUEST_H[03:00] to produce the short literal specifier offset (SL_SPECIFIER_ 
OFFSET_H[02:00]). This signal selects the short literal data and outputs the field 
(XDTA_SL_H[05:04] and XDTB_SL_H[03:00]) to the SL specifier handler. 
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Figure 4-10 Short Literal Logic 

4.1.10.2 Short Literal Specifier Number 

XDTB_SL_SPECIFIER_NUMBER_H[02:00] defines which specifier of the instruction is 
being passed to the SL specifier handler. This signal is generated by adding the specifier 
byte position to OPU_SPECIFIERS_COMPLETED_H[02:00]. 

4.1.10.3 Short Literal Valid 

XDTB_SL_VALID_H is sent with the valid short literal data to the SLU of the OPU. 
This signal is generated by decoding the input from the request logic (XSCA_REQUEST_ 
H[03:00]) and selecting the valid byte containing the short literal specifier. 
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4.1.11 XBAR Source and Destination Logic 

Each specifier decoded and passed to the specifier handlers has an associated source and 
destination field to identify the function of the specifier. The source and destination fields 
are processed by the specifier handlers and loaded into the EBox pointer queues until the 
instruction they represent is executed. Figure 4-11 shows the inputs and outputs of the 
source and destination logic. 
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Figure 4-11 Source and Destination Logic 
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4.1.12 XBAR Source 1 Data Path 

The source 1 data path logic passes three fields to the FPL and, if the specifier is a 
register, also passes a register field to the mask logic. Figure 4-12 shows the XBAR 
source 1 logic. 

The XBAR source 1 logic passes the low nibble of instruction buffer byte 1 to the FPL 
as the register field (XDTA_SOURCE1_REG_H[03:00]). The source 1 validation logic 
receives the following: 

REGISTER_MODE_H[01] 
XSC A_REQUEST_H[03 :0O] 
SPECIFIER1_ACCESS_TYPE_H 
XSCA_SPECIFIERS_DECODED_H[02:00] 

These fields are decoded to output the two source 1 valid fields: 

• XDTA_SOURCEl REG_VALID H is asserted when the present source 1 operand is a 
register specifier. This field validates XDTA_SOURCE1_REG_H[03:00]. 

• XDTA_SOURCEl_VALID_H validates the specifier as a source operand. 

When SOURCEl_REG_VALID_H is negated and SOURCE 1_VALID_H is asserted, the 
source 1 specifier is a memory operand. 
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Figure 4-12 XBAR Source 1 Logic 
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4.1.13 XBAR Source 2 Data Path 

Source 2 fields are passed to the operand handlers if two source specifiers are decoded. 
The source 2 fields are similar to the source 1 fields in structure, but they rely on the 
source 1 specifier characteristics when they are generated. That is, if source 1 is a 
complex specifier, then the location in the I-stream that contains source 2 is determined 
by case, source 1, and source 2 data types and the number of specifiers completed. 
Figure 4-13 shows the generation the source 2 fields. 

The source 2 select logic selects the byte position that contains the specifier. If the 
specifier is not a register, the byte position is selected, but the field (SP2_REGISTER_ 
H[03:00]) is negated. The inputs to the select logic are as follows: 

• CASE_H[01:00] identifies the location in the instruction of the complex specifier. 

• XSCA_SP1_DATATYPE_H[02:00] identifies the data type of the source 1 specifier. 

• REGISTER_MODE_H[07:01], from register mode logic and instruction buffer simple 
decode logic, identifies which bytes contain register specifiers. 

• INDEX_MODE_H[01], IMMEDIATE_MODE_H[01], and ABSOLUTE_MODE_H[01] 
identify the source 2 specifier as either of these three addressing modes. 

• SP2_ACCESS_TYPE_H provides the access type of the source 2 specifier. 

• XSCA_SPECIFIERS_DECODED_H[02:00] identifies the total number of specifiers 
decoded in the instruction currently being decoded. 

The source 2 register field is validated if the specifier is a valid register mode specifier. 
The specifier 2 access type is input into the valid logic to differentiate between the 
destination and the source specifiers. A write or modify access negates the validation of a 
specifier as a source operand. 
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Figure 4-13 XBAR Source 2 Logic 
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4.1.14 XBAR Destination 

The XBAR destination fields are generated for destination specifiers that are passed to 
the specifier handlers. Figure 4-14 shows a simplified block diagram of the destination 
logic. 
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Figure 4-14 XBAR Destination 

4.1.14.1 Destination Valid 

XDTA_DESTINATION_VALID_H is sent to the FPL of the OPU to validate a destination 
specifier. This signal is generated from access type and specifiers decoded fields. 

The access type of the destination specifier must be write, modify, implied write, or VSRC 
modify. Any specifier with an access type of read is a source specifier. 

To validate the destination specifier, the specifiers decoded must indicate the specifier 
under decode (destination specifier) is the last specifier of the instruction. 

4.1.14.2 Destination Register Valid 

Destination registers are validated when the destination specifier is a register. SPx_ 
REGISTER_DESTINATION_H is asserted when the selected destination specifier is a 
register mode specifier. 
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4.1.15 Register Masks 

The register mask logic receives a register number for any register that is read or written 
to by the EBox during the execution of an instruction. All of the register accesses in an 
instruction are accumulated in one of two registers and passed to the OCTL unit of the 
OPU at the completion of the instruction decode. Figure 4-15 shows the XBAR mask 
logic. 
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Figure 4-15 XBAR Read and Write Masks 

4.1.15.1 Read Mask 

A XBAR read mask is generated by decoding each valid source operand that is passed to 
the FPL. The read mask logic receives the following: 

B_SOURCEx_REG_H[03:00] 
B_SOURCEx_VALID_H 
B_SOURCEx_REG_VALID_H 
XSCA_SPx_DATATYPE_H[02:00] 

The source valid and source register fields are decoded to detect valid register specifiers. 
When a valid register specifier is detected, the register field is stored in the read mask 
logic. 

The data type of the specifier (XSCA_SPx_DATATYPE_H[02:00]> further defines the read 
access of the registers by asserting subsequent register numbers in the read mask when 
the base register data type is quadword or octaword. For example, a quadword read of 
register 1 would generate XDTA_READ_MASK_H[14:00] = 0006. This mask identifies 
register 1 and register 2 as being read during the execution of this instruction. 

READ_MASK_H[ 14:00] is passed to OCTL when XSCA_SHIFTOPCODE_H is asserted. 
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4.1.15.2 Write Mask 

The XBAR write mask field contains entries pertaining to registers that are written to 
during the execution of an instruction. The write mask is generated by decoding the 
following: 

DESTINATION_REG_VALID_H 

DESTINATION_VALID_H 

DESTINATION_REG[03:00] 

XSCA_SP3_DATATYPE_H[02:00] 

The data type and the register field generate DESTINATION_MASK_H[14:00] when the 
data type is as follows: 

Write 
Modify 
Implied write 
VSRC modify 

When both DESTINATION_REG_VALID_H and DESTINATION_VALID_H are asserted, 
an entry is added to the write mask. 

4.1.15.3 Implied Mask 

When a character string instruction is detected, the XRAM asserts IMPLIED_MASK_H, 
which asserts XDTA_WRITE_MASK_H[05:00]. RO through R5 are asserted in the write 
mask because they contain the control block that maintains updated addresses and state 
information during the execution of the instruction. 

4.1.16 Intra-lnstruction Read Conflicts 

Intra-instruction read conflicts (IRCs) occur when a read conflict is in the specifiers 
of a single instruction. These conflicts occur when a register specifier is to be used as 
data in the EBox and is subsequently used as the base register for an autoincrement or 
autodecrement. 

An IRC occurs in the instruction ADDL3 RO, (R0)+, Rl. This instruction directs the EBox 
to read RO and also directs the IBox to use RO as the base register for an autoincrement. 
The IBox detects this IRC by monitoring the read and write masks that are generated 
for each instruction. When the XBAR detects an IRC, it notifies the OPU and passes 
the autoincrement or autodecrement specifier to the CSU of the OPU and also passes all 
subsequent register specifiers through the CSU, instead of through only the FPL. 

The CSU processes the autoincrement specifier, updates the IBox copy of the GPRs, but it 
does not update the EBox copy of the GPRs. Subsequent register specifiers are processed 
by the CSU and passed to the EBox as data and placed in the source list. The data 
passed to the source list is the content of the IBox copy of the register. 
The OPU records the modified register numbers until the instruction has been completed 
and then writes the registers to the EBox GPRs using the data in the IBox GPRs. 

The XBAR does not process any subsequent specifiers until the instruction is complete 
(in the EBox) and then writes the IBox (modified) GPRs to the EBox GPRs. This process 
is called a delayed GPR update. 
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The IRC mask decode logic (shown in Figure 4-16) generates a composite IRC mask 
(IRC_MASK_H[08:00]). The composite IRC mask identifies the data types of GPR 
accesses for the IRC detection logic. 

IRC_MASK_H[08:00] is generated by decoding the read mask and decoding the I-stream 
for autoincrement and autodecrement mode specifiers. IRC_YREG_H[03:00] is also input 
into the IRC mask logic. IRC_YREG_H[03:00] identifies the base register of register 
specifiers in the instruction. 

The IRC composite mask, n (number of specifiers decoded), CASE_[01:00], index mode, 
and the specifier data types are input to the IRC detection logic (Figure 4-17. The Rlxx 
shift count and specifiers decoded count are also input to this logic. The output of the 
IRC detection logic produces the specifier counts and specifier decoded counts during IRC 
handling. 

When an IRC is detected, three outputs are produced to direct the special handling 
required for the instruction: 

IRC_H 

IRC_N_H[00] 
IRC_SC_H[02:00] 

IRC_N_H[00] provides the specifiers decoded count for the IRC and can equal only zero 
or one. IRC_SC_H[02:00] provides the shift count for the IRC. During an IRC, the shift 
counts are based only on the decode tree outputs of Rl, R1BB, R1BW, and R1I. 

The XBAR continues to decode one specifier at a time until the instruction is complete. 
All decoded specifiers are passed through the CSU. Autoincrements or autodecrements 
are performed only on the IBox GPRs. The XBAR stalls (IRC_STALL_H) when the 
instruction is completely decoded and waits for the EBox to complete the instruction. 
When the EBox completes the instruction, a delayed GPR update is performed and the 
XBAR resumes decoding (OSQA_DGPR_UPDATE_FINISHED_H negates IRC_STALL_ 
H). 
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Figure 4-16 IRC Detection: Read Mask 
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Figure 4-17 IRC Detection: IRC Mask 

4.1.17 XBAR Stalls 

XSCA detects stalls in the XBAR. Most of the stalls in the XBAR are related to specifiers 
that require special handling or are related to situations external to the XBAR logic. 
The external situations are a result of the limitations of the logic units that receive 
decoded specifiers from the XBAR or because the instruction buffer and/or the VIC are 
not providing valid I-stream to the XBAR. This section describes the logic that initiates 
XBAR stalls, and it describes the conditions external to the XBAR that cause the XBAR 
to stall. 
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When XBAR_STALL_H is asserted, the XBAR forces the decode tree logic and the request 
logic to produce outputs that generate no shift counts and produce zero for a specifiers 
decoded output. During a XBAR stall, the request logic produces an output that selects 
a nonexistent decode tree output (REQUEST_H[O3:00] = F). Selecting the nonexistent 
decode tree output generates a shift count of zero (XSCA_SHIFTCOUNT_H[03:00] = 0) 
and generates a number of specifiers decoded count of zero (N_H[01:00] = 0). The shift 
count of zero directs the instruction buffer to hold (not shift) the I-stream it is presenting 
for decode and the specifiers decoded count of zero results in no valid specifier data being 
passed to the OPU. 

XBAR_STALL_H is asserted when the following conditions are true: 

IRC_STALL_H is asserted after the XBAR passes all the specifiers of an instruction 
that contains an IRC to the OPU. 

RAF H is asserted, signifying a reserved addressing fault. This fault is detected in 
the XBAR. 

FPD_FLUSH_H is asserted when the EBox stalls the execution of an instruction to 
service an exception or an interrupt. 

PCHI_IBUF_FLUSH_H is asserted. This signal is asserted when there is a change in 

I-stream. 

I_VALID_L[00], when negated, there is no valid opcode in the instruction buffer. 

OCTL_MASK STALLJH is asserted when the OPU read/write register mask logic is 
full. 

OSQA_DECODE_STALL_H is asserted because the OPU cannot receive decoded 
specifiers because one or more of the following conditions exist: 

EBox source list or source pointers are full. 

Read or write conflict stall exists (inter-instruction conflict or scoreboard stall). 

Autoincrement or autodecrement follows a branch. 

Branch follows a branch. 

Two conditional branches are buffered. 

GPR update is delayed. 

The XBAR generates two stall signals, OPU_STALL_H and SL_STALL_H, that are 
related to the availability of resources in the OPU. These signals inhibit the XBAR from 
decoding complex or short literal specifiers when they are asserted. 

When OPU_STALL_H is asserted, this informs the XBAR that the OPU is processing a 
complex specifier and has another complex specifier buffered in the OPU stall logic. This 
stall signal inhibits the XBAR from decoding and passing another complex specifier to the 
OPU. This signal is input to the decode tree logic and affects the shift count and number 
of specifiers decoded outputs. When OPU_STALL_H is asserted and another complex 
specifier is decoded, XBAR_STALL_H is asserted. 

SL_STALL_H is asserted to inform the XBAR that two short literal specifiers are 
currently in the SLU. This stall inhibits decoding another short literal specifier and 
asserts XBAR STALL H when another short literal is encountered. 



DIGITAL INTERNAL USE ONLY 



Instruction Decode 4-27 



4.2 Branch Prediction 

This section describes the VAX 9000 family branch prediction functions. It describes the 
three methods used to predict a branch and the microcode that controls them. 

The three methods used to predict branches are as follows: 

• Primary mode is used when the branch being decoded is stored in the branch 
prediction cache (BPC). The fields in the BPC direct the decision of taking or not 
taking the branch and provide the branch PC. 

• Demote is related to primary mode and occurs when the cached displacement or 
instruction length does not match that of the branch being decoded. Under demote, 
the cached prediction bit is used and the branch PC is provided by the CSU. 

• Secondary mode is used when the branch is not stored in the BPC. The branch 
may not be cached because it has not been encountered before or because it is not a 
cacheable branch. In this mode, the branch PC is provided by the CSU. 

4.2.1 Primary Predictions 

Primary branch predictions use the BPC to direct the flow of I-stream when branches are 
encountered. These predictions are based on the content of the BPC. 

The BPC contains five fields that are written when a cacheable branch is first 
encountered and read when the branch is subsequently encountered. Figure 4-18 shows 
the organization of the BPC. 
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Figure 4-18 BPC Organization 
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The BPC has 1024 locations and is never flushed. The following fields describe the 
contents and functions of the branch prediction cache fields: 

• Branch PC tag — This field contains bits [31:10] of the virtual PC of the branch. 

• Prediction PC — This 32-bit field is the destination PC of the branch. 

• Branch displacement — This 16-bit field is the actual displacement of the branch. 

• Branch instruction length — This 6-bit field contains the actual instruction length 
of the branch. 

• Prediction bit — This bit is set when the branch is predicted taken. 

4.2.1.1 Primary Hits 

A branch prediction hit occurs when the BP tag field matches the corresponding bits in 
the decode PC. A hit directs the cached prediction PC to be loaded into the decode PC if 
the prediction bit is asserted. Figures 4-19 and 4-20 show the compare logic of the BPC 
and the results of the comparisons. 
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Figure 4-19 BPC Compare: Hit 
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When a branch is encountered, the BPC is accessed to perform a compare. The compare 

results in one of three outputs: hit, miss, or demote. 

Two cycles are required to produce a predicted taken branch from the branch prediction 

cache. 

1 In the first cycle, DECODE_PC[31:10] and PREDICTION_PC[31:10] are compared 
and produce a hit or miss. A miss directs the prediction to the secondary mechanism 
and a hit requires subsequent comparisons to produce a hit or a demote. 

2. If a hit is encountered in the PC comparisons, DELTA_PC[05:00] and TAG_ 
INSTRUCTION_LENGTH[05:00] are compared, producing a hit or demote. 

3. The XBAR_DISPLACEMENT[15:00] and TAG_DISPLACEMENT[15:00] are 
compared to produce a demote if they do not match. 

4. If a hit is encountered, BP_PREDICTION_H must be set for the branch to be 
predicted taken. 

4.2.1.2 Tag Match Enable 

The IBUF simple decode logic decodes branch instructions and informs the FUU it tne 
branch is cacheable. When IBFB_CACHEABLE_H is asserted and the BPC is not being 
written, a tag match is enabled. Figure 4-21 shows the logic that enables the tag match. 



IBFB_CACHEABLE_H 



BP_WRITE_TB_L 



DISABLE_BP_HIT_L 




ENABLE_BP_TAG_MATCH_H 



MR.X0QBB_89 



Figure 4-21 BP Tag Match Enable 
4.2.2 Demote 

A primary prediction hit is demoted if the instruction length or the displacement of the 
branch do not match those stored in the BPC on a predicted taken branch. This occurs 
when the virtual PCs of the branch match, but they are not the same branch. 
If the displacement and instruction length do not match, the history bit is checked before 
the branch is demoted. If the branch is predicted not taken (history bit = 0), no demote 
is necessary and decoding of the sequential I-stream continues. When the Prediction 
bit is set, the branch is demoted, the target PC is provided by the CSU, and the BPC is 
written. That is, the new branch PC, instruction length, and displacement are written. 

If the prediction is incorrect, the BPC is rewritten with the prediction bit cleared. 
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4.2.3 Secondary Predictions 

Secondary predictions are performed by the branch bias logic of the XBAR. A secondary 
prediction is based on the opcode of the branch instruction. When a secondary prediction 
directs the IBox to take the branch, the branch target PC is supplied by the CSU. 

Secondary mode is used when the branch has not been cached or when the BPC cannot 
be accessed because it is busy being written. Figure 4— 22 shows the inputs and outputs 
of the branch bias logic. 

The branch bias logic (BRAM) is comprised of 22 scan latches and selection logic. The 
latches are loaded with fixed predictions for branches and accessed when a branch that 
has not been cached is encountered. The opcode is used to select the prediction that is 
stored in the branch bias latches. 

The fixed predictions for the branches are based on a bias for that instruction. That is, a 
BEQL tests two conditions for equality. Because equality in mathematical situations is 
rare, this branch would be predicted not taken. 

A branch that is predicted taken in the BRAM logic asserts XDTB_BRAM_BIAS_H. This 
signal is passed to the branch prediction logic. 



EXTENDED_H 




XDTB 


B_BIAS_H 






l_DATA_H[07:00] 


BRANCH 
BIAS 


B_BIAS_l_ 











WFLX0089 89 



Figure 4-22 Branch Bias Logic 
4.2.4 BPC Correction 

Correction refers to the actions taken to correct an incorrect branch prediction before 
the branch instruction is shifted out of the instruction buffer. Branch instructions 
are predicted when they are shifted into the instruction buffer and are acted upon 
when they are shifted out of the instruction buffer. If the EBox validates the branch 
prediction before it is shifted out of the instruction buffer, and the prediction is incorrect, 
a correction is initiated. 

A correction causes the rewriting (inverting) of the BPC prediction bit and the writing 
of the unwind PC (UNWIND_PC_H[31:00]) to the prediction PC (PREDICTION_PC_ 
H[31:00]). If the branch was predicted to be taken and was incorrect, the IBox continues 
following sequential I-stream. If the branch was predicted to be not taken and was 
incorrect, the IBox redirects the I-stream to the PC supplied by the CSU. In both 
cases, the correct PC (sequential or nonsequential) is loaded from the unwind PC to 
the prediction PC. 
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4.2.5 BPC Unwind 

Unwind refers to the actions necessary to correct an incorrect branch prediction whose 
validation arrives after the instruction buffer has shifted the branch instruction out and 
has started preprocessing the wrong I-stream. A BPC unwind initiates the following 
events: 

• Asserts IBOX.ABORT to abort any instruction buffer requests for new I-stream. 

• Flushes the instruction buffer of any I-stream loaded since the branch instruction. 

• Invalidates any pointers to the EBox source list that have been entered after the 
branch prediction was made. 

• Invalidates any OCTL GPR read/write masks that pertain to instruction decoded 
after the bad branch prediction. 

• Rewrite (invert) the BPC history bit to reflect the correct prediction for the branch. 

• The unwind PC is loaded into the prediction PC so that the IBox can start processing 
the correct I-stream. 

4.3 PCU Microcode 

The PCU and BPC are controlled by PCU microcode. This microcode is implemented 
in hard-coded microcode. That is, the microcode structure is generated in hard logic as 
opposed to a conventional RAM structure. 

The PCU microcode consists of 38 bits that are partitioned into 22 fields. The fields are 
addressed by an 11-bit address field that contains PCU state information and the control 
signals to govern the flow of the microcode. 

4.3.1 PCU Microaddress 

The address for the PCU microcode is 11 bits wide. The address is organized as follows: 

UADDRESS[03:00] provides PCU control signals. 

UADDRESS[06:04] indicates where to store OPU_TARGET_PC. 

UADDRESS[10:07] defines the state of the PCU at the beginning of the current cycle. 

Table 4-3 describes the three address fields. 
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Table 4-3 PCU Microaddress Descriptions 



UADDR£SS[03:00] 



PCU Control 



UADDRESSfOO] 

UADDRESS[01] 

UADDRESS[02] 
UADDRESS[03] 



This bit is set when the CSU has delivered a target PC to the PCU 
and there was not an unwind, last cycle. 

This signal indicates whether an IBox branch prediction was correct. 
The signal is only valid when UADDRESS[02] is asserted. 

Asserted, this bit validates UADDRESS[01]. 

Asserted when an unconditional branch is in the IBUF. 



UADDRESS[06:04] 



OPU Target PC Destination 



UADDRESSfOO 1 

UADDRESStOS] 1 
UADDRESSL06] 1 



This bit directs the target PC to be stored in the unwind PC in the 
PCU. 

This bit directs the target PC to be stored in the decode PC. 

This bit is asserted if the PCU has a branch prediction, predict taken 
hit. This informs the microcode to ignore the target PC. 



UADDRESSf 10:07] 



Label 



Description 



0000 



0001 



0010 



0011 
0100 

0101 
0110 



IDLE 



BSHOP VAL 



VAL UNC TAR VAL TAR 



VAL 

VAL VAL TAR 

BSHOP TAR 
VAL TAR 



The PCU has not encountered a branch 
and the IBox is executing sequential 
code. 

PCU has received a target PC from the 
CSU before the branch has been shifted 
out of the IBUF and before EBox has 
validated prediction. 

XBAR has decoded a conditional, 
unconditional, and another conditional 
branch, in that order. There may 
have been sequential code between the 
branches. The PCU is waiting for EBox 
validation of the first branch, a target 
PC for the unconditional branch, and 
EBox validation and target PC for the 
third branch. 

The PCU is waiting for EBox validation 
of a branch prediction. 

Two conditional branches have been 
shifted out of IBUF. The first needs 
EBox validation and the second needs 
EBox validation and target PC from the 
CSU. 

This state occurs when the branch is 
validated before it is shifted out of the 
IBUF. (EBox is ahead of IBox.) 

XBAR decodes a conditional branch and 
shifts it out of the IBUF. The validation 
and target PC follow a few cycles later. 
(IBox is ahead of EBox.) 



1 These three bits are mutually exclusive. Only one of these bits may be set in a cycle. 
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Table 4-3 (Cont.) PCU Microaddress Descriptions 



UADDRESS[10:07] 

oiii 

1000 

1001 
1010 

1011 
1100 



Label 



Description 



VAL TAR VAL TAR 
TARVALTAR 

TAR 

VAL UNC TAR TAR 

TARTAR 

VAL BSHOP VAL 



1101 



1110 



1111 



VAL UNC TAR 



VAL TAR TAR 



VAL VAL TAR UNC TAR 



Two conditional branches have been 
decoded by XBAR with both awaiting 
validation and target PCs. 

Two branches are being processed. 
The first has been validated but needs 
a target PC. The second needs both 
validation and a target PC. 

A target PC is required from the CSU 
to process a branch completely. 

The PCU is processing three branches. 
The first is conditional and needs 
validation. The second and third are 
unconditional and need target PCs from 
the CSU. 

The CSU has not delivered target PCs 
for two branches. 

An outstanding branch is in the PCU 
and the CSU has provided a target 
PC before the second branch has been 
shifted out of the instruction buffer. 
This state occurs only with JSB or 
BSB. 

The PCU is waiting for a validation 
for one branch and a target PC for a 
subsequent unconditional branch. 

The PCU needs validation and a 
target PC for a conditional branch 
and a target PC for a subsequent 
unconditional branch. 

The IBox has shifted out two 
conditional and one unconditional 
branches. The target PC has been 
delivered for only the first conditional 
branch and the EBox has not validated 
either conditional branch. 
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4.3.2 PCU Microword 



The PCU microword is 39 bits wide and is partitioned into 22 fields. Figure 4-23 provides 
a breakdown of the microword. Table 4—4 describes each microcode field. 
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LOAD MARKERS 
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BRANCH OK SEL 



SEC MARKERS IN USE 



Figure 4-23 PCU Microword Format 
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Table 4-4 PCU Microword Field Descriptions 



[03:00] 



[04] 



Next State 



0000 IDLE 

0001 BSHOP VAL 

0010 VAL UNC TAR VAL TAR 

0011 VAL 

0100 VAL VAL TAR 

0101 BSHOP TAR 

0110 VAL TAR 

0111 VAL TAR VAL TAR 

1000 TAR VAL TAR 

1001 TAR 

1010 VAL UNC TAR TAR 

1011 TAR TAR 

1100 VAL BSHOP VAL 

1101 VAL UNC TAR 

1110 VAL TAR TAR 

1111 VAL VAL TAR UNC TAR 



Load Decode 



Description 



See Table 4-3 for field descriptions. 



Description 



Bad branch prediction. Load unwind PC into decode 
PC or PCU waits for target PC from CSU. Load 
decode is set when destination PC arrives. 



[05] 



[06] 



[07] 



Load Unwind 



Load Delayed Target 



Load Second Unwind 



Description 



Unwind PC is loaded when predict taken branch is 
shifted out of IBUF. On predict not taken, unwind PC 
is loaded when CSU provides destination PC. 



Description 



When the CSU delivers a target PC before branch is 
shifted out, this signal instructs the PCU to latch and 
hold the target PC. 



Description 



Same as load unwind, except the PC is loaded into 
second unwind PC (on second branch prediction). 
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Table 4-4 (Cont.) PCU Microword Field Descriptions 



[08] 



Load Second Branch 



Description 



Asserted when a second branch is shifted out of the 
IBUF. The PCU must latch and hold the following: 

Virtual address 
Prediction bit 
Cacheable 
Loop branch 
Instruction length 
Branch displacement 



[09] 



Use Branch OK 



Description 



Set when EBox sends validation of branch before 
shifting the branch out of IBUF. Remains set until 
branch is shifted. 



[13:10] 



BP Write 



Description 



000 


FALSE 


001 


NO HIT 


010 


PTNL 


011 


CA 


100 


NLCA 



101 



NEXT 



BPC is not written when this state is chosen. 

BPC is written if present branch is not stored and 
cacheable. 

BPC is written in this case if the branch is incorrectly 
predicted taken and is not a loop branch. 

Cacheable branch is encountered for the first time 
and is written to the BPC. 

Not loop, cacheable. Branch predicted incorrectly 
from BPC is rewritten correctly (prediction bit) if it is 
not a loop branch. 

This field enables the write of a second branch being 
processed, to BPC, if the branch is cacheable. 



[15:14] 



B3UF Flush 



Description 



00 


NOOP 


01 


TRUE 


10 


NPT 


11 


PT 



Hold previous values. 
IBUF flush. 
Not predict taken. 
Predict taken. 
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Table 4-4 (Cont.) PCU Microword Field Descriptions 



[18:16] 



Branch Taken 



00 


NOOP 


01 


NPT 


10 


PT 


11 


TRUE 


100 


NOTBT 


101 


SBT 



Description 



Hold previous value. 
Not predict taken. 
Predict taken. 
Predict taken. 
Not branch taken. 
Second branch taken. 



[19] 



Correction 



Description 



Asserted when the EBox informs the IBox of a bad 
branch before it has been shifted out of the IBUF. 



[20] 



[2221] 



00 
01 
10 



[23] 



Invert 



BP Address 



Branch PC 



Decode PC 

1 Second branch PC 



Description 



When the EBox informs the IBox of a bad prediction, 
this signal directs the PCU to invert its fields. 



Description 



If the branch is still in the IBUF, the decode PC 
provides the address. 

If the branch has been shifted out of the IBUF, the 
stored branch PC provides the address. 

Second branch PC is selected by this value. 



Description 



The branch PC is loaded into the decode PC. 

The branch PC is loaded into the second branch PC 
(two branches being processed). 
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Table 4-4 (Cont.) PCU Microword Field Descriptions 



[2724] 



Load Markers 



Description 



0000 
0001 



NOOP 
TRUE 



0010 



0011 



0100 



FALSE 



SET SID 



TAR CASES 



0101 



0110 



0111 



1000 



FROM SECOND 



SETIG 



SECOND 



PUSH 



Load markers with values held last cycle. 

The marker set depends on the prediction of the 
branch and if the branch hits in the BPC. The 
following table provides the value dependent on 
these two variables. 

BPHit 





1 
1 

All markers, first and second, are cleared when all 
branches are completely processed or when EBox 
detects a bad branch prediction. 

This state clears all markers and sets STORE_IN_ 
DECODE. 

This state is set when the validation of a branch 
arrives before the destination PC. This state depends 
on three signals: BRANCH_OK, BP_HIT, and 
PREDICTJTAKEN. 

BPHit 



Predict 


Marker Set 


Taken 







Store in unwind. 


1 


Store in decode. 





Store in unwind. 


1 


Ignore PC. 



Branch 
OK 


B 

















1 





1 


1 





1 





1 


1 


1 


1 



Predict 


Marker Set 


Taken 







Store in decode. 


1 


Ignore PC. 





Store in decode. 


1 


Ignore PC. 





Ignore PC. 


1 


Store in decode. 





Ignore PC. 


1 


Ignore PC. 



When processing two branches, at the completion of 
the first, this field is selected to move the markers 
stored in the second position into the first position. 

When the PCU is awaiting a destination PC from the 
CSU, and the EBox indicates the prediction is wrong, 
markers are switched from store in decode to ignore 
PC. 

When the microcode selects the second set of markers, 
the TRUE state is used as the input. 

This field is only used when going to state VAL VAL 
TAR UNC TAR. The first markers get the content of 
the second marker and the second marker gets loaded 
with appropriate markers for the unconditional 
branch that has just been shifted out of the IBUF. 
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Table 4-4 (Cont.) PCU Microword Field Descriptions 



[2928] 



[31:30] 



00 
01 



[33:32] 



[35:34] 



01 



10 
11 



[36] 



[37] 



[38] 




1 



Target PC Select 



00 OPU results 

01 Unwind PC 

10 Delayed target PC 



Decode PC Select 



Target PC 
Next PC 



Unwind PC Select 



00 Next PC 

01 OPU result 

10 Second unwind PC 



Set Unwind 



Next 



OPU result 
Second unwind 



Load Branch 



Branch OK Select 



Second Markers 



Description 



Target PC provided by the SCU. 
Unwind PC provides target PC. 
Wait for target PC. 



Description 



Target PC is selected on predict taken. 
Next PC is selected on predict not taken. 



Description 



Description 



On a bad branch prediction, this signal informs the 
IBox to unwind from the prediction. This signal is 
used only if the branch has been shifted out of the 
IBUF. Next selects next PC on a bad predict taken. 

Load OPU result on bad not taken prediction. 

Load second unwind PC. 



Description 



This field is the same as load second branch. The 
same information is stored but pertains to the first 
branch. 



Description 



This field can force the validation signal from 
the EBox. The signal indicates that the PCU has 
acknowledged the bad prediction and has corrected 
it without causing any damage to the integrity of the 
IBUF. 



Description 



When a demote occurs, demote the first branch. 
When a demote occurs, demote the second branch. 
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4.3.3 Writing the BPC 

The PCU writes the information pertaining to cacheable branches it has encountered. 
Certain criteria must be met for a branch to be cacheable: 

• Branches must have displacements that will not change during the execution of the 
instruction. 

• Branches must have a displacement as part of the I-stream. 

A branch that has already been written to cache will not be rewritten unless the 
prediction is incorrect. The EBox informs the IBox of an incorrect prediction and the 
prediction bit is inverted. 

4.3.3.1 BPC Write Enable 

The BP write field of the PCU microcode selects the field that enables the BPC write 
signal. Figure 4-24 shows the write enable logic. 
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BP_WRITE_SSL[O2:O0l (MICROCODE 113:10]) 
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MR xoo9'_ee 



Figure 4-24 BPC Write Enable 
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4.3.3.2 Cache Tag Write 

Figure 4-25 shows the BP tag address write logic. To write the tag, either the branch 
PC, decode PC, or second branch PC is selected at the BP tag multiplexer. Selection is as 
follows: 

• Decode PC is selected if the branch is still in the instruction buffer. The branch 
remains in the instruction buffer if the address must be read from or written to. 

• Branch PC is selected after the branch has been shifted out of the instruction buffer. 
The virtual address of the branch is loaded into the branch PC. 

• Second branch PC is used when the PCU is processing two branches. This address 
can be loaded directly into the BPC or into the branch PC when the first branch 
address is no longer needed. 

The selected PC provides bits [31:10] of the virtual address of the branch to be written to 
the BPC tag field. 



DECODE_PC 




BRANCH_PC 




SECOND_BRANCH_PC 







BP ADDRESS 



BP_ADDR£SS_SE L10-I:001 (MICROCODE [22:31 1) [ 

MR_X0092_89 

Figure 4-25 BPC Tag Write 

4.3.3.3 instruction Length Field Write 

The instruction length field is loaded from the delta PC. This PC is derived from the 
accumulated shift counts since the last SHIFT_OPCODE_H. This value, when the 
instruction is shifted out of the instruction buffer, yields the instruction length. 

Figure 4-26 shows the logic generating the detla PC. The delta PC represents the current 
instruction length or, if a second branch is being processed, it is selected as the second 
branch instruction length. 



CUflRENT_INSTRUCTION_LENSTH_H[05:00) 



XBAR_SHIFTCOUNT[05;00| 
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Figure 4-26 BP Instruction Length 
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4.3.3.4 Prediction PC Write 

Figure 4-27 shows the inputs that are written to the BP prediction PC field. 

The time when the PCU writes the prediction PC field depends on the prediction of the 
branch. When a branch is predicted not taken, the field is written when the branch is 
shifted out of the instruction buffer. The value that is written for a branch predicted not 
taken is unpredictable. The field that is written is not valuable because the branch is 
predicted not taken the next time it is encountered. 

A branch that is predicted taken must wait for the target PC to arrive from the CSU 
before it can be written. 

When a branch is predicted incorrectly, the BPC must be updated with the correct 
information. In this case, the unwind PC supplies the correct prediction PC. 



TARGET PC 



OECODE PC 



DECODE PC SELECT 



BP TARGET PC 



MR XOCBi 88 



Figure 4-27 BP Prediction PC Write 

4.3.3.5 BP Displacement Write 

The branch displacement is calculated by the XBAR and, when the branch is cacheable, 
is written to the BPC. Figure 4-28 shows the logic that selects from two inputs to be 
written to the BPC. 

XBAE_DISPLACEMENT_H[15:00] is written to the BPC or latched as SECOND, 
BRANCH_DISPLACEMENT_H[15:00] when two branches are being processed. 



XBAR_DISPLACEMENT[15:00] 



A_SECOND_BRANCH_D!SPLACEMENT[15:00) 



DISPLACEMENT(1S:00] 



MR_X009S_BS 



Figure 4-28 BP Displacement Write 

4.3.3.6 BP Prediction Bit 

The BP prediction bit is the bit that is stored in the BP cache and determines the history 
of the branch. This bit is the reference to whether the branch was predicted taken or 
predicted not taken the last time it was encountered. 

This section provides a detailed description of the reading and writing of the prediction 
bit. 
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4.3.3.6.1 BP Prediction Bit Read 

Two inputs supply the BPC prediction bit during a read operation, (refer to Figure 4-29) 

• BPP_PREDICT - the history bit that is stored in the BPC. 

• XDTB_BRAM_BIAS - the prediction bit supplied by the BRAM bias logic. This 
prediction is the fixed prediction that is based on the opcode of the branch instruction. 

The selection between these two inputs depends on if the branch prediction tag match 

occurs in the BPC. When a tag match occurs, the prediction bit from the BPC is used. 

When there is no tag match in the BPC or the BPC cannot be read because it is currently 

being written, the branch prediction is supplied by the branch bias logic. 

Loop branches and unconditional branches assert FORCE_OUTPUT which always selects 

PREDICTJTAKEN. 

4.3.3.6.2 BP Prediction Bit Write 

The prediction bit is written to the BPC when the branch is initially encountered and 
may also require being rewritten (inverted) when a branch prediction is determined to be 
incorrect. The logic that writes the prediction bit can write a prediction bit for a second 
branch prediction while the preceding branch is still under evaluation. 
In Figure 4-29, TAKEN or PCBP_BP_TAKEN is the output that is used to access the 
BPC history bit during read and write operations. For reads, BP_ADDRESS_SEL_ 
HC01-00] (PCU microcode bits [22:21]) selects A_PREDICT_TAKEN to access the BPC. 
For writes to the BPC, BP_ADDRESS_SEL_H[01:00] selects either BRANCH_TAKEN or 
A_SECOND_BRANCH_TAKEN. Either of these fields is selected when writing the BPC 
on a miss, or for rewriting the BPC on a correction or an unwind. 

BRANCHJTAKEN is selected from one of five different fields by BRANCH JTAKEN. 
SEL_H[02:00]. The selection of the source for this field depends on timing and different 
states or conditions related to the branch instruction. The following list describes the five 
sources and when they are used: 

• The previous output is selected during a noop. 

• The inverted output of PREDICTJTAKEN is selected for a correction. 

• PREDICTJTAKEN is selected for the write path when the branch, misses in the BPC 
and it is an unconditional branch. 

• On a bad not taken prediction, rewriting the BPC is done by selecting the field that 
forces the history bit to be asserted (shown as 1 in Figure 4-29). 

• An unwind uses the latched and inverted version of BRANCH_TAKEN to rewrite the 
history bit. 

• When two branches are being processed, PREDICTJTAKEN is latched and then used 
to rewrite the history bit if it is required (correction or unwind occurs). 

BRANCH_TAKEN_SELECT[02:00] controls the selection of the write field and is supplied 
by the PCU microcode bit [18:16]. 
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Figure 4-29 BP Prediction Bit Write 
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4.3.3.7 BPC Address Selection 

On read and write references, the BPC is addressed by PCBP_BP_ADDRESS[05:00]. 
When reading the BPC, DECODE_PC[09:00] provides the address (Figure 4-30). When 
writing the BPC, BRANCH_PC[09:00] is used. BRANCH_PC is also used to address the 
BPC when rewriting the BPC during a correction or an unwind. 
When a second branch is to be written before the first is validated, the decode PC is 
latched and selected as SECOND_BRANCH_PC[09:00] to address the BPC. 

The ten bits of the address are supplied by two of the PCU MCAs: 

PCBP provides bits [05:00]. 
PCVC provides bites [09:06]. 

The field that selects the BP address (BP_ADDRESS_SEL[01:00]) is provided by the PCU 
microcode bits [22:21]. 



SECOND BRANCH PC 



DECODE PC 



BRANCH PC 



SECOND BRANCH PC 
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Figure 4-30 BPC Address Selection 
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5 

Specifier Decode 



This chapter describes the logic representing the IBox specifier decode pipeline, stage. It 
also describes the three specifier handlers; the interfaces to the MBox, EBox, and VBox; 
PC generation; and related stalls and flushes. 

5.1 Overview 

The XBAR decodes macroinstructions and passes individual specifiers to the OPU 
MCU. Within the OPU, the specifiers are processed by a specifier handler and passed 
as operand data to the EBox or passed as an operand address to the MBox. 

The OPU MCU contains three specifier handlers: 

• Complex specifier unit — The CSU handles all specifiers other than short literal 
and register specifiers. 

• Short literal unit — The SLU expands the short literal and floating short literal 
specifiers into a format and passes them to the EBox source list. 

• Free pointer logic — The FPL manages the pointer to the Ebox source list. The 
free pointer points to the next location in the source list that the IBox can write to. 
Source and destination pointers are passed through this unit to the EBox source and 
destination pointer queues. 

The following list describes the processing of the instruction ADDL3 RO #54 R5 in the 
specifier decode pipeline stage. The example is based on the XBAR having decoded the 
instruction in a single cycle. 

• The FPL receives RO, register valid, and source 1 valid for the first instruction 
specifier. 

• The SLU receives the short literal data (54) and short literal valid bit. The FPL 
receives source 2 valid and register not valid. 

• R5, the register valid bit, and the destination valid bit are sent to the FPL for the 
destination specifier. 

• OCTL receives a 31-bit register mask indicating RO will be read and R5 will be 
written. 
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From these inputs, pointers for each of the specifiers are passed to the EBox in the 
following manner: 

• The source 1 logic passes a 5-bit field to the source pointer queue. The bit field 
indicates that the operand is a register and contains the register number. 

• The short literal specifier (source 2) is expanded by the SLU and passed to the EBox 
source list. The free pointer provides the address in the source list that will receive 
the operand. The 5-bit source 2 pointer is placed in the source pointer queue and 
contains the location in the source list of the short literal operand. 

• The free pointer is incremented by 1. This sets the free pointer to the next free 
location in the source list. 

• The destination pointer is passed to the destination pointer queue and contains a 
5-bit field indicating that the destination is a register and indicating the register 
number. 

• The register mask is stored in the register mask logic and is not discarded until the 
instruction is executed by the EBox. 

If the destination specifier were a register-deferred specifier, it would be processed by the 
CSU. When processing ADDL3 RO #54 (R5), the two source specifiers would process the 
same as previously described. The destination specifier (R5) would be processed by the 
CSU. To process this specifier, the CSU sends the contents of R5 (the destination address) 
to the MBox write queue and the destination pointers to the EBox destination queue. 
The destination pointer contains a bit signifying a memory destination. 

For the instruction CLRW @(R4)+[R5], the CSU performs the following steps: 

1. Sends an OPU port request to the MBox to return the contents of the address 
supplied by R4 and autoincrements R4 by four. 

2. Multiplies the contents of the index register (R5) by two and then adds the data 
returned from the MBox to the contents of R5. The result of the addition produces 
the address of the operand which is sent as an OPU request with the data being 
returned to the EBox source list. 
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5.1.1 Stall Logic 

Figure 5-1 shows a block diagram of the stall buffer that each of the specifier handlers 
has to buffer the inputs from the XBAR. These stall circuits enable the specifier handlers 
to receive two specifiers from the XBAR before they inform the XBAR that it must stall. 

The stall buffer consists of two scan latches, a latch, and a multiplexer. The multiplexer 
provides selection of input data or stalled data to be processed by the specifier handler. 
This data is held in a scan latch until the present specifier sequence is completed. The 
present specifier sequence is represented by current data and can also be stalled data if 
the sequence requires multiple cycles or if the specifier handler has stalled. 

At the start of a specifier sequence, the input data is selected by the following: 

OPU SEQUENCE_START_H in the CSU 
SL_SEQUENCE_START_H in the SLU 
FPL.STALLEDJL in the FPL 

The input data is selected as current data and passed to the functional units of the 
specifier handler. Current data is also passed through the latch and scan latch, and 
placed on the input of the multiplexer. Any subsequent specifier cycles for this operation 
will continue to select stalled data at the multiplexer until the next sequence start. 



STALLED DATA 
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Figure 5-1 OPU Stall Logic 
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5.2 Complex Specifier Unit 

The complex specifier unit (CSU) provides the following functions: 

Handles complex specifiers. 

Calculates operand addresses and branch target addresses. 

Controls the MBox OPU port. 

Provides IBox data control to the EBox. 

Performs GPR and RLOG write operations. 

The CSU is contained in the OPU MCU and is divided across five MCAs. The MCAs 
provide the following functions: 

• OPUA — Low word of the data path 

• OPUB — High word of the data path 

• OSQA — Control unit for RLOG, GPRs, OPUA, OPUB, and the OPU port and EBox 
interfaces 

• STG2 — Low word of the GPRs 

• STG3 — High word of the GPRs 

Figure 5-2 is a basic block diagram showing the organization of the CSU. 




Figure 5-2 CSU Organization 
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The OPUA and OPUB MCAs contain the adders that calculate the target addresses of 
operands for the EBox. A context shifter (multiplier) is contained in the OPUA logic. 
The OPU port and the EBox interfaces are physically contained in these two MCAs and 
receive control from OSQA and OSQB. 

The GPRs are dual port STREGs that can be read from and written to under control 
provided by OSQA These GPRs can also receive inputs from the VBox. The VBox sends 
addresses to the MBox through the OPU. 

5.2.1 OPUA Data Path 

The OPUA MCA controls bits [15:00] of the address and data path of the CSU. OPUA 
contains an adder that receives inputs that are added together to calculate the operand 
data or address. Two multiplexers, AMUX and BMUX, supply the data to the adder from 
a variety of sources. When processing a displacement mode specifier, the AMUX provides 
the register (YREG) to the adder and the BMUX provides the sign-extended data to the 
adder The adder adds the two inputs together and produces the operand address. (For 
example, OPUA produces the low 16 bits of the address while OPUB produces the high 
16 bits.) 

OPUA receives displacement data from the XBAR, sign extends the data, and presents 
it to the adder of the MCA as an operand. OPUA also accesses the GPRs. The CSU 
microcode selects the relevant operands for the specifier that is currently being evaluated 
and adds them together. The output of this operation can be used as an address for an 
OPU port request, passed as data to the EBox, sent to the PCU as a target PC or placed 
on the input of the adder for the next cycle operation. Figure 5-3 shows a block diagram 
of the data and address paths of the OPUA MCA. 

The AMUX receives XBAR DISPLACEMENT_H[15:00], the low 16 bits of the 32 bits 
of XBAR displacement. This input (XBAR_DISPLACEMENT_H[15:00]) represents the 
following: 

• Byte, word, or longword displacement for addressing modes A through F 

A Byte displacement 

B Byte displacement deferred 

C Word displacement 

D Word displacement deferred 

E Longword displacement 

F Longword displacement deferred 

• Immediate data for addressing mode 8F 

• Absolute address for addressing mode 9F 

• Byte, word, or longword displacement for relative addressing modes (AF through FF) 

• Byte or word displacements for branch instructions 

XBAR DISPLACEMENT_H[15:00] is sign extended to produce SXDISP_H[15:00], which 
is then placed on the input of the two multiplexers (AMUX and BMUX) that provide the 
data to the adder circuit. 
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Figure 5-3 OPUA 
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5.2.1.1 AMUX Inputs 

The AMUX receives six inputs, one of which is selected by the OPU microcode (AMUX 
SELECT_H[02:00]) and passed as one half of the operand to the adder. The inputs to the 
AMUX are as follows: 

• SXDISP[15:00] — XBAR displacement that has been sign extended. 

• CURRENT PC_H[15:00] — The current updated PC. This PC is generated by 
adding XBAR_DECODE_PC and DECODE_DELTA and is used for calculating branch 
target PCs and in PC relative addressing. 

• OPU data — Data that has been returned from the MBox in response to an OPU 
port request when processing an indirect addressing mode specifier. 

• STG2 YGPR_H[15:00] — YGPR is the GPR that is being referenced in the present 
specifier cycle? YGPR represents any GPR reference except for indexed specifiers. 

• Delayed GPR data — YGPR data that has not been written to the EBox GPRs. 
Delayed GPR data is used when processing intra-instruction read conflicts (IRCs). 

• OPUA result — The result of the last addition (adder output) that is placed back on 
the input of the AMUX for further processing. 

5.2.1.2 BMUX Inputs . , D ,, TTV 

The BMUX receives six inputs, one of which is selected by the OPU microcode (BMUX_ 
SELECT_H[02:00]) and passed as the operand to the context shifter and then the adder. 
The inputs to the BMUX are as follows: 

• OPUA result — The result of the last addition (adder output) that is placed back on 
the input of the BMUX for further processing. 

• SXDISP[15:00] — Displacement data provided by the XBAR. 

• STG2 XGPR — The input to the BMUX to represent which register is being accessed 
for an indexed operation. 

• Constant -1 — Selected for autodecrement operations. 

• Constant +1 — Selected for autoincrement operations. 

The BMUX output is passed to the context shifter, which multiplies the output by 0, 
1, 2, 4, 8, or 16. The multiplication type is selected by the OPU microcode (CSHFT_ 
SEL_H[o'200]) for index, autoincrement, autodecrement, and autoincrement-deferred 
operations. BMUX inputs are passed unchanged through the CSHFT logic by selecting 
zeros as the function. 
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5.2.1.3 Adder 

The output of the AMUX multiplexer (AMUX_H[15:00]) is input to a 16-bit adder and 

added with the output of the context shifter (CSFT_H[15:00]) to produce RESULT. 

H[ 15:00]. Any carry produced from the addition is sent to a similar add circuit in OPUB. 

The output of the OPUA adder (RESULT_H[15:00]) is passed to one of the following 
signals to the other functional units of the MBox, EBox, and IBox: 

• IBOX_DATA_H[15:00] is passed to the EBox source list as an operand. 

• IBOX_OP_ADDRESS_H[ 15:00] sends result data to the MBox as the address of an 
operand with data being returned to the IBox or to the EBox source list, or as the 
destination address of an operand where the EBox is to write the destination data. 

• OPUA_RESULT_H[ 15:00] is selected as the output and sent to the PCU as the 
target address of a branch instruction. This output (OPUA_RESULT_H[ 15:00]) can 
also be selected and written to a GPR (STG2) that was updated (autoincrement or 
autodecrement operation). OPUA_RESULT_H[ 15:00] can also be selected as an input 
to AMUX or BMUX for subsequent add operations. 

5.2.2 OPUB Data Path 

The OPUB MCA performs functions similar to OPUA. This MCA receives the high-order 
bytes of XBAR displacement data (XBAR_DISPLACEMENT_H[31:16]) and produces the 
high-order bytes of the operands or operand addresses and GPR update data. 

Figure 5-4 shows a block diagram of the OPUB MCA. OPUA contains an AMUX, a 
BMUX, and a context shifter. As in OPUA, the output of the AMUX is added to the 
output of the BMUX and context shifter to produce the operand address or the operand 
data. OPUB uses two adder circuits. The two outputs they produce are "add with carry" 
and "add without carry." (For example, the high-order 16 bits of the operand are added 
together, producing an output that assumes a carry and a no carry from the low-order 
addition.) The output of the two adders are placed on a multiplexer and selected by the 
carry bit from the adder of OPUA (OPUA_ADDER_COUNT_H). 
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Figure 5-4 OPUB 
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5.2.3 Current PC Generation 

The CSU calculates the current PC for branch instruction target addresses and for PC 
relative addressing mode specifiers. The current PC generation logic is divided between 
the OPUA and OPUB MCAs. OPUA generates the low-order word of the current PC 
while OPUB generates the high-order word. This section provides a detailed description 
of the current PC logic. 

Figures 5-5 and 5-6 are detailed block diagrams describing the current PC generation in 
the CSU. 
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Figure 5-5 CSU Current PC (High Slice) 



MR XC<OI_M 



PC_OPLLDECOP£_PC_H[1 5:00)_ 



SCAN 
LATCH 



SPECIFIER_DECODE_PC|1S:00] 



> 



XDTB_DECODE_DELTA_H[05:00) 



SCAN 
LATCH 



SPECIFIER_D£CODE_DELTA_H[15:00] 



OSOA_OPU_SEO_START_H 



SPECIFIER_PC_H|15:0O] 



. OPUB 
(CARRY) 



SCAN 
LATCH 



CURRENT_PC_H|I5:00| 



TB i^ 

^ LATCH "" * 



SCAN 
LATCH 



STALLED_PC_M|1S:001 



INVERT_SPECIFIER_PC_PARITY_H|01:00] 



Figure 5-6 CSU Current PC (Low Slice) 
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5.2.3.1 OPUA Current PC [15:00] 

OPUA receives PCxx_OPU_DECODE_PC_H[15:00] from the PCU. These 16 bits are 
delivered in the following manner: 

PCBP sends bits [07:00]. 
PCVC sends bits [12:08]. 
PCLO sends bits [15:13]. 

The XBAR sends XDTB_DECODE_DELTA_H[05:00], which equals the total number of 
instruction buffer bytes that have been shifted out of the instruction buffer since the last 
SHIFTOPCODE was asserted. 

These two signals are latched and then added (Figure 5-6) together to produce 
SPECIFIER_PC_H[15:00], which is latched and produces an output of CURRENT_PC_ 
H[1500] A^arry from the addition results in a carry bit being asserted in the logic that 
generates the high-order word of the current PC. CURRENT_PC_H[15:00] is an AMUX 
input in the OPUA data path of the CSU. 

This logic also contains a stall buffer that controls selection of a current PC output or a 
stalled output (STALLED_PC_H[15:00]). The stalled PC is selected when the CSU is in 
sequence. 

5.2.3.2 OPUB Current PC [31:16] 

The high-order word for the current PC (CURRENT_PC_H[31:16]) is generated in OPUB 
and is an input to the AMUX in the OPUB data path of the CSU. Figure 5-5 shows a 
block diagram of the logic that generates the high-order word of the current PC. 

OPUB receives PCxx_OPU_DECODE_PC_H[31:16] from the PCU. Theses 16 bits are 
delivered in the following manner: 

PCHI supplies bits [31:24]. 

PCLO supplies bits [23:16]. 
These two PC fields are latched and produce SPECIFIER_DECODE_PC_H[31:16], 
which is input to a multiplexer for final selection of the current PC. Also input to the 
multiplexer is an incremented version of SPECIFIER_DECODE_PC_H[31:16] (PC_PLUS_ 
CIN_H[31:16]). The incremented PC is selected when the calculation of the low-order 
word of the current PC results in a carry. 

This logic also contains stall circuitry similar to that in the logic that generates the 
low-order word of the current PC. 
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5.2.4 CSU Microcode 

The CSU is controlled by microcode logic that is resident on the OSQA MCA. The 
microcode controls the multiplexers that supply inputs to the adders of the CSU. Control 
of the OPU port interface, GPR writes, YGPR incrementing, and source list writes is also 
provided under microcode control. 

5.2.4.1 CSU Microaddress 

The CSU microaddress is 10 bits wide and contains 4 fields that govern the flow of the 
microcode (Figure 5-7). Table 5-1 lists the fields of the microaddress. 



09 08 05 04 



02 01 00 



INDEX 


MODE 


ACCESS TYPE 


COUNT 



MR_XD270_8S 



Figure 5-7 CSU Microaddress Format 
Table 5-1 CSU Microaddress Descriptions 



Index [09] 


Indexed Indication 


Label 







Nonindexed specifier 


NI 




1 


Indexed specifier 


I 










Original Mode 


Mode [08:05] 


Specifier Mode 


Label 


Values 


0000 


Autoincrement 


AINC 


8x 


0001 


Autoincrement deferred 


AINCDEF 


9x 


0010 


Displacement 


DISP 


Ax, Cx, Ex 


0011 


Displacement deferred 


DISPDEF 


Bx, Dx, Fx 


0100 


Unused 


4 


Ox, lx, 2x, 3x, 4x 


0101 


Register 


REG 


5X 1 


0110 


Register deferred 


REGDEF 


6x 


0111 


Autodecrement 


ADEC 


7x 


1000 


Immediate 


IMM 


8F 


1001 


Absolute 


ABS 


9F 


1010 


Relative 


REL 


AF, CF, EF 


1011 


Relative deferred 


RELDEF 


BF, DF, FF 


1100 


Unused 


12 


OF, IF, 2F, 3F, 4F 


1101 


Unused 


13 


5F 


1110 


PC deferred 


PCDEF 


6F 


1111 


PC autodecrement 


PCADEC 


7F 


Access Type 








[04:02] 


Specifier Access Type 


Label 




000 


Address (ASRC) 


A 




001 


Read 


R 




010 


Write 


W 





'This specifier mode is selected when IRC is asserted in the XBAR. 
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Table 5-1 (Com.) CSU Microaddress Descriptions 



Access Type 
[04:02] 



Oil 
100 
101 
110 

111 



Specifier Access Type 



Modify 

Vield source (VSRC) 

Branch displacement 

Implied 

Callx specifier 



Label 



M 

V 

B 

I 

C 



Count [01:00] 



00 
01 
10 

11 



Microword in Sequence 



First microword 
Second microword 
Third microword 
Fourth microword 



Label 





1 
2 
3 



Register Mode 
Count [01:00] 



00 
01 
10 

11 



Register Mode Size 



Longword data type 
Octaword data type 
Three longwords 
Quadword data type 



Label 



5.2.4.2 CSU Microword 

The CSU microword is 19 bits wide and is divided into 12 fields (Figure 5-8). Table 5-2 
lists the fields of the microword. 



.., WRITE MUL BMUX : AMUX 

EOS SLIST FACTOR SELECT SELECT] 



MICROWORD INC 
PARITY YREO ' 



WRITE 

GPRS 



WRITE 
PC 



MREO 
DEST 



MREO 
SIZE 



Figure 5-8 CSU Microword Format 
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Table 5-2 CSU Microword Field Descriptions 



Bits 



Name 



Value Description 



03:00 



07:04 



AMUX select 



BMUX select 



000 Previous cycle's adder output. 

001 Sign-extended I-stream data. 

010 Current specifier's PC. 

011 OPU port data returned to OPU. 

100 Contents of specifier's base GPE. 

101 Contents of current delayed GPR. 

000 Previous cycle's adder output. 

001 Sign-extended I-stream data. 
010 Constant value of one. 

100 Contents of specifier's index GPR. 

101 Constant value of minus one. 



11:08 Multiply factor 000 Multiply BMUX output by zero. 

001 Multiply BMUX output by one. 

010 Multiply BMUX output by two. 

011 Multiply BMUX output by four. 
100 Use specifier's context. 



12 



Write source list 



Don't write adder result to source list. 

1 Write adder result to source list. 



13 EOS 

14 MREQ 

16:15 MREQ address select 

19:17 MREQ size 

21:20 MREQ destination 



Specifier sequence continues. 

1 End of specifier sequence. 

Don't issue an MBox OPU port request. 

1 Issue an MBox OPU port request. 

Use adder result as OPU port address. 

1 Use AMUX output as OPU port address. 

00 Use specifier's context as request size. 

01 Force request size to word. 

10 Force request size to longword. 

11 Force request size to quadword. 

Return data to OPU. 

1 Write data to EBox source list. 



22 



23 



Write PC 



Write GPRs 



Don't send target PC. 

1 Send target PC if appropriate. 

Don't write GPRs. 

1 Write IBox and EBox GPRs. 



24 



25 



Increment YREG 



Microword parity 



Don't increment base GPR pointer. 

1 Increment base GPR pointer. 

Parity bit disabled, microword has odd number of bits. 

1 Parity bit enabled, microword has even number of bits. 
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Figure 5-9 shows signals involved in generating a CSU microword. The following list 
describes the source of each signal used to generate the CSU microword: 

• OSQB_ACCESS_TYPE_H[03:00] is from the OSQB DRAM and defines the access 
type of the current specifier. 

• XDTB_INDEXED_H is from the XBAR and is asserted when the specifier is indexed 
mode. 

• XDTB_MODE_H[03:00] is from the XBAR and provides the addressing mode of the 
specifier. 

• XDTA_XREG_H[03:00] is from the XBAR and provides the base register for any 
register operands. 

A count field is also input to the microaddress generation to control the count of the 
CSU sequence. The count field is generated by decoding the inputs to the microaddress 
generation logic. 

5.2.5 CSU Stalls 

In the CSU, the OSQA MCA contains the logic that monitors conditions that may initiate 
stalls. The stalls that occur in this unit can be related to handling certain specifiers 
(read and write conflict stalls) or related to the units that the CSU is supplying data to. 
This section describes the stalls that occur in the CSU, and it describes how each stall is 
detected and how each stall is cleared. 

5.2.5.1 Scoreboard Stalls 

The OCTL MCA contains the scoreboard logic that tracks the reading and writing of 
GPRs by the EBox during the execution of macroinstructions. Read and write conflicts 
occur when the IBox is directed to read or write a GPR before the EBox has performed a 
required operation on the same GPR. When these conflicts occur, the CSU must stall and 
wait for the EBox to complete the conflicting instruction before it can proceed. 

Two types of read and write conflict stalls can be asserted in the CSU. The two types of 
stalls are caused by the same conflicts but differ only in the timing of the detection of 
the stall. For example, a read conflict can initiate a stall by asserting a SPECIFIER. 
READ_CONFLICT_H or a CURRENT_READ_CONFLICT_H. A specifier read conflict 
occurs~when the conflicting data is latched from the XBAR but has not yet been acted 
on by the CSU. A current read conflict occurs after the CSU begins evaluating the data 
(OPU_SEQ_START_H is asserted). 

This section describes the initiation of read conflict and write conflict stalls, and it 
describes the clearing of these conditions. 
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Figure 5-9 CSU Microword Select 
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5.2.5.1.1 Read Conflict Stall 

Read conflicts are detected by OSQA and initiate scoreboard stalls. A read conflict ^occurs 
when the CSU is directed to update (autoincrement or autodecrement) a GPR that the 
EBox is directed to read. For example: 

MOVL RO, xx (read RO) 

ADDL2 yy, (R0)+ (autoincrement RO) 

The above sequence of instructions causes a read stall because the EBox is directed to 
read RO and the IBox is directed to update RO. Depending on how many instructions 
ahead of the EBox the IBox is, the IBox could possibly update RO before the EBox reads 
RO. 

Read conflicts are detected by monitoring the read mask (READ_MASK_H[15:00]), 
the specifier GPR (YREG_H[03:00]), and the occurrence of an autoincrement or 
autodecrement (AUTOXXX_MODE_H). When the read mask and YREG both point to 
the same register and an autoincrement or autodecrement is to be performed by the UbU, 
SPECIFIER READ_CONFLICT_H and SPECIFIER_READ_STALL_H are asserted in 
OSQA. These stall signals also assert OPU.STALLEDJi and SCOREBOARD_STALL_H. 
The stall is cleared by a flush or unwind or when the instruction containing the conflict 
is executed by the EBox. When a flush or BP unwind occurs, INIT_OPU_AND_SL_H 
is asserted and any read or write conflicts are negated. When the EBox completes an 
instruction EBOX_INSTRUCTION_DONE_H is asserted. This signals directs OCTL to 
discard the masks related to the completed instruction. If the mask that is discarded is 
the one that asserted the stall, then the CSU resumes operation. 

5.2.5.1.2 Write Conflict Stall 

Write conflicts are similar to read conflicts in the CSU. Write conflicts occur when the 

EBox is directed to write a GPR and the IBox is directed to read the same GPR. 

Write conflicts are detected by monitoring the write mask (WRITE_MASK_H[15K)0]) and 

the specifier GPR (YREG_H[03:00]) for register mode addressing. When a match ot the 

write mask and a YREG is detected, a write conflict is asserted. 

A write conflict can also occur when processing index mode specifiers. OSQA compares 

the index register (XREG_H[03:00]) with the write mask to detect this conflict. 

Write conflicts assert CURRENT_WRITE_STALL_H or SPECIFIER ^WMT E STALL_H 

which asserts OPU_STALLED_H to stall the CSU and asserts SCOREBOARD_STALL_H 

in the OCTL MCA. These stalls are cleared by an EBox flush, a PCU unwind, or when 

the EBox has completed the conflicting instruction and directs the OCTL mask logic to 

discard the write mask associated with it. 
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5.2.5.2 Branch Under Branch Stall 

The CSU stalls when a conditional branch is encountered, if an outstanding branch has 
not yet been validated by the EBox. OSQA maintains a count of conditional branches 
that have been processed by the CSU. The unconditional branch count is incremented 
each time the CSU sends a target PC to the PCU (TARGET_PC_H[31:00], and TARGET, 
VALID_H) and the current instruction is an unconditional branch (CURRENT. 
UNCOND_BRANCH_TB_L negated). 

The branch under branch stall (CURRENT_BUB_STALL_H) occurs when the following 
conditions exist: 

• BRANCH_CNT_EQL_1_H is asserted. 

• SPECIFIER_BRANCH_ACCESS_TYPE_H is asserted. This signal is from the OSQB 
DRAM access type. 

• SPECIFIER_DATA_AVAILABLE_H is asserted. This signal is asserted when XSCA_ 
VALID_H is asserted and XDTA_RAF_L is negated. 

• SPECIFIER_UNCOND_BRANCH_L is asserted in the instruction buffer simple 
decode logic. 

• OPU_SEQ_START_H is asserted. 

CURRENT_BUB_STALL_H asserts OPU_STALLED_H. The stall is cleared by an EBox 
flush, PCU unwind, or when the first branch is validated by the EBox (EBOX BRANCH 
VALID_H). " 

5.2.5.3 AUTOxx Under Branch Stall 

The CSU does not process autoincrement and autodecrement specifiers when an 
outstanding branch prediction is awaiting EBox validation. When an autoincrement, 
autodecrement, or autoincrement-deferred mode specifier is detected and the branch 
count is equal to one, the CSU stalls and waits for the branch prediction to be validated 
before it continues operation. 

CURRENT_AUB_STALL_H is asserted to initiate this stall when the following conditions 
exist: 

SPEC_NOT_BRANCH_OR_IMPLIED_H is asserted. 

SPECIFIER_DATA_AVAILABLE_H is asserted. 

SPECIFIER_AUTOXX_MODE_H is asserted. 

BRANCH_COUNT_EQL_l_H is asserted. 

OPU_SEQ_START_H is asserted. 

Asserting CURRENT_AUB_STALL_H also asserts OPU_STALLED_H. The stall is 
cleared by an EBox flush, a PCU unwind, or when the EBox validates the outstanding 
branch prediction. 

During this stall, it is possible for the CSU to initiate an OPU port request to handle the 
AUTOxx specifier. For an autoincrement specifier, the CSU can read the GPR and then 
write (autoincrement) the GPR when the stall is negated. 
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5.2.5.4 OPU Port Grant Wait Stall 

When the CSU issues an OPU port request, the MBox must respond to the request before 
the CSU can continue processing. When the MBox does not respond, the CSU stalls. 
When an OPU port request is issued, the CSU selects stalled A and B addends as input 
to the CSU adder by asserting OSQA_OP_GRANT_WAIT_H. This holds the port request 
information until the MBox responds. 

OP GRANT_WAIT_H asserts OPU_STALLED_H if another OPU port request is issued. 
The stall is cleared when the MBox responds to the request by asserting MBOX_OP_ 
GRANT_TA_L or if an EBox flush or PCU unwind occurs. 

5.3 Short Literal Unit 

The short literal unit (SLU) expands the 6-bit integer and floating short literal (SL) 
operands into a relevant field (32-, 64-, or 128-bit field). 

Figure 5-10 shows the inputs and outputs of the SLU. This unit receives XBAR_ 
SLE05-00] and a copy of the opcode (OPCODE[08:001) from the XBAR. The opcode is 
decoded to determine the SL data type. The data type and the SL data are input to the 
SL data formatter, which outputs the expanded SL data. 

The SLU can produce a single longword of expansion per cycle. Data formats larger than 
a longword require successive cycles to produce their output. 
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Figure 5-10 Input and Outputs of the Short Literal Unit 
5.3.1 Short Literal Processing 

Figure 5-11 shows a block diagram of the SLU. The SL formatter provides integer and F- 
, D-, G-, and H-floating formats to the format select logic. The SL formatter receives the 
6-bit short literal data (XBAR_SL_H[05:00j; and expands it into four formats (INTEGER. 
H[31:00], F/D_FLOAT_H[31:00], G_FLOAT_H[31:00], and H_FLOAT_H[31:00]). 

The expansion select logic selects one of the four SL formatter outputs, PREVIOUS. 
EXPANSION_H[31:00], or zero. PREVIOUS_EXPANSION_H[31:00] is selected when 
short literal data that was previously expanded was not written to the EBox source list 
(for example, IBox-to-EBox interface is busy). Zero is selected in cycles other than the 
first in multiple cycle expansion (for example, D-, G-, and H-floating formats). The data 
type of the specifier being processed (SL_DATA_TYPE_H[03:00]) and a signal monitoring 
the status of the EBox source list (SLIST_FULL_H) are input to expansion select. The 
data type is generated by decoding instruction opcodes in the OSQB access type and data 
type logic and provides selection of the expanded short literal data. 
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Figure 5-11 Short Literal Unit Block Diagram 

Short literal sequences start when SL_SEQUENCE_STAET_H is asserted. Assertion of 
this signal selects valid short literal data (XBAR_SL_H[05:00]) from the input stall buffer 
to be loaded into the SL formatting logic. SL_SEQUENCE_START_H is asserted when 
the following occurs: 

XDTB_SL_VALID_H is asserted, signifying that the XBAR is passing valid short 
literal data to the SLU. 

XDTA_RAF_L is negated, signifying that the XBAR did not detect an RAE 

SL_LWORDS_LEFT_TA_EQL_0_H is asserted, signifying that no short literal 
specifiers are currently being processed. 

PREV_SLIST_FULL_L is negated, signifying that the source list is not full. 

INIT_OPU_AND_SL_L is negated, signifying that no flush or PCU unwind is in 
progress. 
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At the start of a short literal sequence, SL_LWORDS_H[02:00] is loaded into SL_ 
LWORDS.LEFT H[02:00], SL_LWORDS_H[02:00] is originally loaded with the value 
or count that is derived from the data type of the current SL specifier being evaluated. 
For byte, word, and longword formats, the count equals one. For quadword formats, the 
count equals two; for octaword formats, the count equals four. 

When CURRENT_SL_LWORDS_H[02:00] equals SL_LWORDS_LEFTH[02:00], OSQA_ 
SL_WRITES_SLIST H is asserted to write expanded short literal data to the source list. 
Each time the write'signal is asserted, SL_LWORDS_LEFT_H[02:00] is decremented. 
When the longword count is decremented to zero, the SL sequence is complete. 

5.3.2 Integer Expansion 

Integer short literal operands represent values from to 63. The short literal operands 
are zero extended, according to their data type, and passed to the EBox source list. 

Figure 5-12 shows the outputs of the SLU when processing an integer short literal 
operand. In this example, the zero expansion of 63 occurs in a single cycle for the byte, 
word, and longword contexts. Quadword outputs occur in a minimum of two cycles. 
Producing the octaword format requires a minimum of four cycles. 
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Figure 5-12 SLU Integer Expansion 
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5.3.3 Floating-Point Expansion 

Figure 5-13 shows the format of the 6-bit short literal field when representing a floating- 
point number. 

The 6-bit floating short literal field is expanded to produce the relevant F-, D-, G-, and 
H-floating operands. Figures 5-14, 5-15, 5-16, and 5-17 show the format of each type of 
floating output from the SLU. 

Bit 3 of SL_DATA_TYPE_H[03:00] is asserted to indicate a floating point literal field and 
also provides the most significant bit (MSB) of the exponent. The actual expansion is 
performed by correctly positioning the exponent and fraction fields. 
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Figure 5-13 SLU Floating Point Literal Format 
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Figure 5-14 SLU F-Floating Expansion 
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Figure 5-15 SLU D-Floating Expansion 
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Figure 5-16 SLU G-Floating Expansion 
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Figure 5-17 SLU H-Floating Expansion 
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5.3.4 Outputs to the EBox Interface 

The 14-bit short literal operand is passed as 32 bits of short literal data to the EBox 
across the EBox interface. This field is constructed by zeroing out the remaining 18 bits 
as the data is passed to the EBox source list. 

5.3.4.1 Order 

When specifiers from the CSU and SLU are both valid, XBAR_SL_ORDER is used to 
keep the operand entries to the free pointer in order. When asserted, the short literal 
specifier precedes the complex specifier. This signal is generated by the XBAR. 

5.3.5 Stalls 

The SLU stalls for three reasons: 

• The unit is processing a multicycle SL specifier. 

• The EBox interface cannot accept the expanded SL data because the OPU is using 
the interface. 

• The EBox queues are full. 

5.3.5.1 Source List Full 

The FPL tracks the available free slots in the source list. When the source list is full, 
OSQB_SLIST_FULL_H is asserted in the OSQB MCA and passed to the OSQA MCA. 
The receipt of this signal disables any SLU or CSU writes across the EBox interface to 
the source list. 

5.3.5.2 SLU Stalled 

The XBAR must be informed of the status of the SLU so that when the SLU is busy 
processing one specifier, another is not passed to it. OSQA_SL_BUSY_STALL_H is 
asserted and sent to the XBAR to signify that the SLU is busy processing a specifier. This 
signal is an output of the comparator, in the SLU, that tracks the number of longwords 
remaining in a SL expansion. 

5.3.5.3 EBox Interface Output Stall 

The EBox interface cannot pass both SL and CS data to the EBox simultaneously. If SL 
data is available to be written to the source list and the EBox interface is busy with CS 
data, SL_BUSY_STALL_H is asserted until the interface is free to accept the SL data. 

5.3.6 Parity Coverage and Errors 

Three key signals of the SLU are parity checked before the SL specifier is processed: 

XDTB_SL_SPECIFIER_NUMBER_H[02:00] 

XDTB_ORDER_H 

XBAR_SL_H[05:00] 
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5.4 Free Pointer Logic 

The free pointer logic (FPL) processes and passes pointers that are used by the EBox to 
direct the execution of instructions: 

Source 1 pointer 

Source 2 pointer 

Destination pointer 

Free pointer 
The EBox maintains queues for the source and destination pointers it receives from the 
IBox The source pointers point to entries in the source list or to GPRs. The source list 
is a queue for storing operands. These operands have either been passed by the IBox or 
prefetched from memory on behalf of the EBox. 

To manage the source list, the IBox generates a free pointer. The free pointer points to 
the next free location in the source list that the IBox will write to. 

The structure of the EBox queues is as follows: 

• Source pointer queue — A 5-bit wide x 16-location deep queue. The top bit of 
the entry specifies if the operand is in the source list or a GPR. The remaining bits 
contain the address, in the source list, or the GPR number. 

• Destination pointer queue — A 5-bit wide x 8-bit deep queue. The top bit in 

the queue specifies if the destination is memory or a GPR, while the remaining bits 
contain the GPR number when applicable. 

• Source list — A 16-entry circular queue for memory, immediate, and short literal 
operands. Each time an entry is inserted into the source list, the free pointer must 
be incremented to point to the next free location. If the operand size is a quadword 
or octaword, the free pointer must be incremented to reflect the size. 

Figure 5-18 shows a simplified block diagram of the FPL. 

5.4.1 Source 1 Pointer 

The source 1 pointer is generated and validated in XDTA and passed to the FPL logic in 
OSQB. Three signals deliver the source 1 information: 

• XDTA_SOURCE_REG_H[03:00] is the register number containing the operand if it is 
a register operand. 

• XDTA_SOURCE_REG_VALID_H is asserted when the operand is a register. 

• XDTA_SOURCE_VALID_H is asserted when a valid operand is being passed. 

These signals allow the FPL to generate a pointer, validate the pointer, and, if the 
operand is not a register, allocate the source list entry and update the free pointer. 

Figure 5-19 shows a block diagram of the source 1 pointer logic. The source 1 register 
field (XDTA_SRC1_REG_H[03:00]) is placed on the input of the pointer select multiplexer, 
if no stalls are present. 
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Figure 5-19 FPL Source 1 Pointer Logic 

The register valid signal (XDTA_SRC1_REG_VALID) is latched from the XBAR and 

provides the select at the pointer multiplexer. Register valid asserted denotes a register 

operand and selects the register field to be entered in the source queue. If negated, 

the free pointer is selected and an entry is allocated in the source list for the source 1 

operand. 

Aside from providing a pointer select, the register valid signal is latched as bit 4 of 

the source 1 pointer and, when negated, generates SRC1_N0T_REGISTER H. When 

asserted, SRCl_NOT_REGISTER_H initiates the process of incrementing the free 

pointer. 

To pass a valid source 1 pointer, IBOX_SRCl_VALID_H must be asserted. This signal 

is passed from the XBAR, and, if no stalls or flushes are present, is passed to the EBox. 

This signal enables the EBox to enter the source 1 pointer into the source queue. 

5.4.2 Source 2 Pointer 

The source 2 pointer and valid signals are generated in a similar manner as the source 1 
pointer. Figure 5-20 is a detailed block diagram of the source 2 pointer logic. 
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Figure 5-20 FPL Source 2 Pointer Logic 
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The pointer multiplexer of this logic provides the selection of three sources for the source 
2 pointer. 

• CURRENT_SOURCE2 JIEGISTER[03:00] — When source 2 is a register operand, 
the source 2 register field is passed. 

• FREE POINTER[03:00] — Selected when source 1 is a valid register and source 2 
is valid but not a register operand. This entry is submitted to the source list. 

• FREE_POINTER_PLUS_SOURCE1_SIZE[03K)0] — When both source 1 and 
source 2 are valid but not registers, the entry for the source list (free pointer) must 
be allocated after the source list entries for the source 1 operand. 

The select for the pointer multiplexer is provided by the source 1 and source 2 not register 
signals. 

5.4.3 Free Pointer 

As entries are added to the source list, the free pointer must be incremented to reflect the 
size of the entries. Figure 5-21 is a block diagram depicting the management of the free 
pointer. 
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Figure 5-21 Free Pointer 



DIGITAL INTERNAL USE ONLY 



5-30 Specifier Decode 



The opcode of each instruction is decoded in the access/data type logic, with the data type 
providing the specifier size for each valid memory operand. Register operands have no 
effect on the free pointer because they are not entered into the source list. 

When source 1 and source 2 pointers have both been passed as valid entries into the 
source list, the free pointer is equal to the size of the two operands (SRCl_PLUS_SRC2_ 
SIZE) plus the previous free pointer. 

If either source 1 or source 2 is a register operand, then only the size of the memory 
operand must be added to the free pointer to produce a new free pointer. When both 
source 1 and source 2 are register operands, the free pointer is not affected. 

5.4.3.1 Free Pointer Initialization 

On a flush of the free pointer logic, the EBox copies the free pointer into the last 
pointer in the source list, and the IBox increments the free pointer (FREE_POINTER_ 
PLUS1[03:00]) so that the source list appears empty. 

5.4.4 Destination Pointer 

The destination pointer is generated and validated in XDTA and passed to the FPL in 
OSQB. Three signals deliver the destination pointer information: 

• XDTA_DESTINATION_REG_H[03:00] is the register number of the destination 
operand when it is a register operand. 

• XDTA_DESTINATION_REG_VALID_H is asserted when the destination operand is a 
register. 

• XDTA_DESTINATION_VALID_H is asserted when a valid destination operand is 
being passed. 

Figure 5-22 shows a block diagram of the destination pointer logic. 

5.4.4.1 Destination Register 

The XBAR supplies the destination register field (XDTA_DESTINATION_REG[03:00]) 
and, when the register valid bit is set (XDTA_DESTINATION_REG_H), the operand is a 
register and the field is placed in the destination queue as the address of the GPR that 
receives the destination data. 

5.4.4.2 Destination Valid 

The destination valid signal (XDTA_DESTINATION_VALID) is passed by the XBAR. 
This signal validates the destination pointer if FPL_FLUSH, EBOX_QUE_FULL, or 
DISABLE_VALIDS are not asserted. 

5.4.4.3 Destination Memory 

To differentiate between a register destination and a memory destination, IBOX_ 
DESTINATION_MEMORY_H is used. When asserted, this signal signifies a memory 
destination is being passed. This signal is asserted when XDTA_DESTINATION_VALID_ 
H is asserted, XDTA_DESTINATION_REGISTER_H is negated, and no flush is present. 

All destination pointers are sent to the EBox destination pointer queue. 
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Figure 5-22 Destination Pointer Logic 
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5.5 Operand Control Unit 

The operand control (OCTL) unit provides control and distribution of stall and flush 
signals from the EBox and from other IBox functional units. OCTL also stores the read 
and write register masks that are generated for each instruction. Figure 5-23 is a basic 
block diagram of OCTL. 



EBOX_FLUSH_H|02:00) 




OCTL 


FLUSH 




XDTA_READ_MASK[14:00] 




PCU 

INSTRUCTION BUFFER 


XDTA_WRITE_MASK[14:00] 






VIC 
MBOX 


PCHI_UNWIND_H 




EBOX_READ_MASK|14:00) 






SCOREBOARD_STALL 


EBOX_WRITE_MASK[1 4:00) 






MASK_VALID 
















MR_X0117_89 



Figure 5-23 OCTL Unit 
5.5.1 Read/Write Masks 

Up to six read and write masks can be stored in OCTL. Each mask is passed by the 
XBAR when the instruction it represents has been completely decoded. The masks are 
generated to prevent the EBox and IBox from using stale data during the execution 
of instructions. Figure 5-24 is a detailed block diagram of the read and write mask 
generation, storage, and unwind logic. 

Each instruction that is decoded in the XBAR generates a 31-bit mask that is passed to 
the OCTL unit. This field is passed at the completion of the instruction decode with a 
valid bit (XDTB_MASK_VALID_H) and is broken down as follows: 

XDTA_MASK_H[30] is the odd parity for the mask field. 
XDTA_MASK_Hf29:15] is the write mask field. 
XDTA_MASK_H[ 14:00] is the read mask field. 

If the first instruction the IBox decoded were ADDL2 R0, Rl, the mask the XBAR 
generates would contain bits and 1 set in the read mask and bit 1 set in the write mask. 
This mask would be selected by REG_INSERT[02:00] to be stored in REG.MASKO. 

The following list describes the three basic operations that can be performed with the 
masks. The signals that initiate the function precede the descriptions. 

XDTB_MASK_VALID_H — Accept a new mask from the XBAR. 
EBOX_INSTRUCTION_DONE_H — Discard the oldest mask. 
EBOX_KEEP_MASKS[02:00] — Unwind the masks on a bad branch prediction. 

These three functions affect the following two fields: 

• REGISTER_SIZE_H[02:00] is the field that indicates the number of the masks that 
are valid. 

• REGISTER_INSERT_H[02:00] is the field that points to the position where the next 
mask should be inserted. 
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SCOREBOARD_STALl_TB_L 



MASK_VAUD_H 



REGJNSERT|02:001 



CORRECTION[02:00] 



EBOX_KEEP_MASKS 




MODULO 6 

SUBTRACT 



WRITE_MASK 
lNSTRUCT10N_DONE 
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SCAN 
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SCAN 
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EBOX_MASK PARITY oso . 



OCTL_EBOX_WRITE_MASK[1 4:00) 



KR.XB11I.8* 



Figure 5-24 OCTL Read/Write Masks 

5 511 Mask Valid 

XDTB_MASK_VALID H, asserted, directs REGISTER_SIZE and REGISTERJNSERT to 
be incremented. The mask field is then inserted into MASKx. Incrementing is done in 
modulo 6 (count = through 5). 

The six sets of registers that hold the masks are cyclic. That is, if the last mask was 
MASKS, the next mask to be used would be MASKO. 

Th^ 1 EBlTlt°the n compretion of an instruction, asserts INSTRUCTION J30NE.H This 

signal directs the mask logic to discard the mask associated with the instruction that was 

decoded. 

Discarding the mask is accomplished by decrementing (modulo .6) ^^STI^IZE 

H[02:00]. This function directs the oldest mask to be deleted while REGISTER_lNbfcJU_ 

H[02:00] still points to the next valid position for a mask to be written. 
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5.5.1.3 Correction 

In the event of a bad branch prediction, the EBox must direct the IBox masks to be 
unwound back to the point of the branch. The EBox directs the unwind operation by 
sending EBOX_KEEP_MASK_L[01:00] to the mask logic. Figure 5-25 shows the logic 
involved in the correction. 



REG_INSERT_H[02:00] 



CORRECTION_H[02:00J 



MODULO 6 
COUNTER 



REGISTER_SIZE_H[02:00] 



REG_INSEfiT_H[02:00J 



REGISTER 
VALID 



MASKx_VALID_H[29:001 



Figure 5-25 OCTL Mask Correction Logic 

NEW_REG_INSERT_H[02:00] is generated by subtracting CORRECTION_H[02:00] from 
REG_INSERT_H[02:00]. The subtraction is modulo 6. Table 5-3 describes the generation 
of NEW_REG_INSERT_H[02:00] with the possible combinations of REG_INSERT[02:00] 
and CORRECTION_H[02:00]. 

Table 5-3 Modulo 6 Subtraction Logic 





REG_INSERT[02:00] 












CORRECTION 














H[02:00] 


12 


3 


4 


5 


6 


7 





000 001 010 


Oil 


100 


101 


XXX 


XXX 


1 


101 000 001 


010 


on 


100 


XXX 


XXX 


2 


100 101 000 


001 


010 


Oil 


XXX 


XXX 


3 


Oil 100 101 


000 


001 


010 


XXX 


XXX 


4 


010 Oil 100 


101 


000 


001 


XXX 


XXX 


5 


001 010 Oil 


100 


101 


000 


XXX 


XXX 


6 


000 001 010 


Oil 


100. 


101 


XXX 


XXX 


7 


XXX XXX XXX 


XXX 


XXX 


XXX 


XXX 


XXX 



Legend 

0-7 = Input count for CORRECTION_H[02:00] or REG_INSERT[02:00] 
000-101 = NEW_REGJNSERT_H[O2:O0J binary output 
XXX = Not possible 
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NEW_REG_INSERT_H[02:00] is decoded with REGISTER_SIZE_H[02:00] to produce the 
register valid field. Table 5-4 provides the output of the decode logic for the register valid 
fields. 



Table 5-4 Register Valid Fields 



REGISTER_SIZE_H[02:00] 



NEW_REG_ 

I NSERT_H[02:00] 

000 

001 
010 
Oil 
100 
101 



000 



001 



010 



on 



100 



101 



110 



000000 


100000 


110000 


111000 


111100 


111110 


111111 


000000 


000001 


100001 


110001 


111001 


111101 


linn 


000000 


000010 


000011 


100011 


110011 


111011 


linn 


000000 


000100 


000110 


000111 


100111 


110111 


mill 


oooooo 


001000 


001100 


001110 


001111 


101111 


mill 


000000 


010000 


011000 


011100 


011110 


011111 


mm 



Legend 

000-101 = Input count for REGISTER_SIZEJI[02:00] or NEW_REG_INSERTJI[02:00] 

000000-111111 = REG5 valid through REGO valid output 

Where a 1 indicates valid and indicates invalid 

For example, 100000 = REG5 valid, 000011 = REG1 valid and REGO valid 



Using Table 5-4, if NEW_REGISTER_INSERT_H[02:00] were 001 and REGISTER. 
SIZE_H[02:00] were 010, REGISTER_VALID_H would equal 100001. This value 
represents REG5 and REGO as valid. 

5.5.2 Read/Write Mask Parity 

REG_MASK0[30:00] through REG_MASK5_H[30:00] are parity checked at their output of 
the OCTL MCA. An error detected within one of the masks asserts REG_MASK_PERR_0 
through REG_MASK_PERR_5, with through 5 corresponding to the register mask. 

REG_MASK_PERRx asserts OCTL.ERROR, which is passed to the OSQA MCA OSQA, 
on receipt of this error, asserts SPECIFIER_ERROR_H and forwards it to the EBox. 

5.5.3 Flushes 

The OCTL unit receives EBOX_FLUSH_H[02:00], decodes the field, and distributes flush 
signals to the appropriate functional units. Figure 5-26 shows the flush logic in the 
OCTL unit. 

The EBox redirects the flow of the IBox with a 3-bit flush code. Table 5-5 provides the 
breakdown of these flush fields. 
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Figure 5-26 OCTL Flush Logic 



Table 5-5 EBox Flush Codes 



Code 



Name 



Action 



010 


VBox mode 


011 


Unsuspend 


100 


IBUF flush 


110 


VIC flush 


111 


FPD flush 



lxx 



PC flush 



Directs the IBox to enter VBox mode. 

Directs the XBAR to resume decoding specifiers. 

Flushes the instruction buffer. 

Directs the PCU to flush the VIC. If, during the 256 cycles 
that it takes the VIC to complete the flush, a subsequent flush 
is received, the IBox stalls. 

First part done (FPD) flush is used by the EBox to stall the 
execution of lengthy instructions. The instruction, under 
EBox microcode control, is stalled during execution to service 
interrupts or exceptions. The XBAR receives this signal. 

Each flush directed to the IBox functional units is also copied 
to the PCU. This copy of the flush redirects the PCU to a new 
prefetch PC. 
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5.5.4 OCTL Stalls 

Two stalls are related to the read/write mask logic: scoreboard stall and mask stall. A 
scoreboard stall (SCOKEBOARD_STALL_H) occurs when instructions containing GPR 
conflicts are being processed. Scoreboard stalls inhibit updating the EBox GPRs until the 
instruction containing the conflicts is completely processed. Mask stall (OCTLJMASK 
STALLH) is asserted when the IBox is ahead of the EBox by six instructions. The stall 
inhibits the IBox fom processing more I-stream. 

5.5.4.1 Scoreboard Stall 

OSQA, in controlling the EBox interface, detects this stall and inform the mask logic. 
The stall directs the mask to be submitted again for a write to the EBox. 

5.5.4.2 Mask Stall 

Figure 5-27 is a detailed block diagram describing the mask stall logic. 

On power-up, the SPU loads a value of six (MASK_COUNT_H[02:00]) into one 
comparator and MASK_COUNT_H[02:00] minus one into a second comparator. Each 
time a new mask is loaded, REG_SIZE_H[02:00] is compared to MASK_COUNT_ 
H[0200] The two comparators used in this process output MASK_FULL_NEXT_ 
CYCLE.H when the count reaches five and MASK_CURRENTLY_FULL_H when the 
count reaches six. OCTL_MASK_STALL_H is asserted to inhibit passing read and write 
masks to the EBox when the IBox is six instructions ahead of the EBox. 
The value that is passed to the mask logic (MASK_COUNT_H[02:00]) is programmable. 
This value can be set at the console from one to six. Setting the value to one allows the 
IBox to get one instruction ahead of the EBox. 
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Figure 5-27 OCTL Mask Stall Logic 
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5.6 OPU Port Interface 

The OPU port interface is a read/write interface to the MBox. The interface is 32 bits 
wide and is byte parity protected. The interface is used for three reasons: 

Prefetching operands, from memory, for the EBox 

Queuing destination operand addresses for data from the EBox 

Fetching operands that are indirectly addressed 

Passing VBox requests to the MBox 

Figure 5-28 lists the signals that comprise the OPU port interface and Table 5-6 
describes them. 
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Figure 5-28 OPU Port Interface 
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Table 5-6 OPU Port Interface Signals 



Name 



IBOX_OP_REQUEST_H 

IBOX_OP_ADDRESS_H[31:00] 

mOX_OP_ADDRESS_PARITY_H[03:00] 

ffiOX_OP_CONTEXT_H[02:00] 



Description 



When asserted, indicates that an OPU port request 
is being made and that all other OPU port fields are 
valid. 

These lines are the 32-bit address sent to the MBox. 

Byte parity for the OPU address. 

Defines the reference size of the OPU port request. 



Field 


Reference Size 


000 


Longword 


001 


Byte 


010 


Word 


011 


Unused 


100 


Quadword 


101 


Octaword 


110 


Unused 


111 


Block (16 quadwords) 



IBOX OP_CONTROL_H[02:00] 



Provides the reference type of the transaction. 



Field 


Reference Type 


000 


Read flock status) 


001 


Read with write check (lock status) 


010 


Read with write check (no conflict check) 


011 


Write check flock status) 


100 


Read (don't lock status) 


101 


Read with write check (don't lock status) 


110 


Unused 


111 


Unused 
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Table 5-6 (Cont.) OPU Port Interface Signals 



Field 



Reference Size 



IBOX_OP_INDIRECT_Hf 01 :00] 



Indicates the final destination of the MBox reference. 



Field 



Destination 



00 Nonwrite reference for the EBox 

01 Write reference for the EBox 

10 Indirect OPU read for a nonwrite specifier 

11 Indirect OPU read for a write specifier 



IBOX_OP_TAG_H[03:00] 

IBOX OP CONTROL PARITY H 



Provides the address in the source list that the 
returning MBox data should be written. 

Odd parity for IBOX_OP_CONTEXT_H[02:00], IBOX_ 
OP_CONTROL_H[02:00], IBOX_OP_INDIRECT_ 
H[01:00], and IBOX_OP_TAG_H[03:00]. 



5.7 IBox-to-EBox Interface 

The IBox interface to the EBox is used to send the CSU and SLU operand data, control, 
and error information to the EBox. Figure 5-29 lists the signals that comprise the 
IBox-to-EBox interface and Table 5-7 describes them. 
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IBOX 


IBOX DATA H[31:16, 15:00] 




EBOX 


OPUB, A 
OPUB, A 

OSOA 

OSOA 

OSOA 

OSQA 

OSOB 

OSOB 

OSQB 

OSOB 

OSQB 

OSOB 

OSQB 

OSQB 

OSQB 

OSOB 

OSOB 

XSCA 

IBFB. A 

IBFB 

XSCA 

OSQA 
OSQA 
OSQA 
XDTA 
OSQA 
OSQA 
OSQA 
OSQA 
OSQA 
XSCA 
PCHI 

PCHI 

PCHI. LO 
PCVC, BP 

PCLO 
XSCA 
OSQB 
OSQB 
XSCA 
XDTA 




IBOX DATA PARITY H[03:02, 01:001 




IBOX DATA TAG H, 1(03:00] 




IBOX DST DATA TAG_H[03:00) 




IBOX DATA VALID H, L 




IBOX OSOA ISSA PARITY_H 




IBOX SOURCE1 POINTER_H, L[04:00] 




IBOX SOURCE1_VALID_H, L 




IBOX SOURCE2 POINTER_H, L[04:00] 


IBOX SOURCE2 VALID_H, L 




IBOX OSQB SRCS_PARITY_H 




IBOX DESTINATION MEMORY_H 




IBOX DESTINATION POINTER_H[03:00] 




IBOX DESTINATION_VALID_H 




IBOX OSOB ISSB PARITY_H 




IBOX OSQB QPTR PARITY H 




IBOX FREE POINTER_H[03:00] 




IBOX FORK H[08, 07:04, 03:00] 




IBOX FORK AODRESS_PARITY_H 




IBOX FORK_VALID_H 




IBOX GPR H|03:00] 




IBOX DST GPR_H[03:00] 




IBOX GPR_WRITE_H 




IBOX REGISTER FORK_H 




IBOX RLOG CONTEXT_H[03:00] 




IBOX_RLOG_TAG_H|01 :00] 




IBOX_RLOG_WRITE_H 




IBOX RLOG COMPLETER 




IBOX OSQA_RLOG_PARITY_H 




IBOX_INSTRUCTION_DECODED_H 
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IBOX_PREDICTION_H 




IBOX PC_H|31:24, 23:13. 120B. 07:00] 
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Figure 5-29 IBox-to-EBox Interface 
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Table 5-7 IBox-to-EBox Interface Signals 



Name 



Description 



IBOX_DATA_H[31:00] 

ffiOX_DATA_PARITY_H[03:00] 
IBOX_DATA_TAG_H, L[03:00] 

IBOX_DATA_VALID_H 

IBOX_DATA_ERROR_H 

IBOX_OSQA_ISSA_PARITY_H 

IBOX_SOURCEl_POINTER_H, L[04:00] 



IBOX_SOURCEl_VALID_H, L 
IBOX_SOURCE2_POINTER_H, L[04:00] 

IBOX_SOURCE2_VALID_H, L 

IBOX_OSQB_SRCS_PARITY_H 

IBOX_DESTINATION_MEMORY_H 

IBOX_DESTINATION_POINTER_ 
H[03:00] 

IBOX_DESTINATION_VALID_H 

IBOX_OSQB_ISSB_PARITY_H 

IBOX_OSQB_QPTR_PARITY_H 

IBOX_FREE_POINTER_H[03:00] 

IBOX_FORK_H[08:00] 

IBOXJFORK_PARITY_H 

IBOX_FORK_VALH)_H 

IBOX_GPR_H[03:00] 

IBOX_DST_GPR_H[ 03:00] 

IBOX_WRITE_H 

IBOX REGISTER_FORK_H 



This bus delivers the short literal or immediate 
data to the EBox source list. GPR updates, due to 
autoincrement and autodecrement specifiers, are also 
delivered to the EBox GRPs with this bus. 

Byte parity for IBOX_DATA_H[31:00]. 

Address in the source list that the IBOX_DATA is to 
be written to. 

When asserted, indicates that there is short literal or 
immediate data for the source list. 

Asserted to inform the EBox that a specifier error has 
been detected. 

Odd parity for IBOX_DATATAG_H[03:00] and 
IBOX_DATA_VALn). 

Provides the EBox source queue with the address, 
or the GPR of the source 1 operand. If the operand 
is SLU, CSU, or MBox data, the field provides the 
source list address for the operand. 

Validates the source 1 pointer. 

The same as the source 1 pointer, except it is the 
pointer for the source 2 operand. 

Valid bit for the source 2 pointer. 

Parity bit for source 1 and 2 pointers and valid bits. 

Asserted to inform the EBox that the destination 
operand is memory (not register). 

Provides the destination operand address. The 
address contains a GPR reference or write queue 
address in the MBox. 

When asserted, validates all destination signals. 

Parity bit for destination pointers and valid bit. 

Parity bit for source 1, source 2, and destination valid 
bits and pointers. 

Provides the current free pointer value in the source 
list. 

Copy of the opcode. 

Parity bit for IBOX_FORKH[08:00]. 

Asserted to indicate that IBOX_FORK_H[08:00] is 
valid. 

The address of the EBox GPR that is to be written. 
This copy of the address is used by the RLOG. 

Contains the address of the EBox GPR that is to be 
written. This copy of the address is for the STREG. 

Valid bit for IBOX_GPR_H and IBOX_DATA_H. 

Asserted to inform the EBox that a USRC specifier is 
being decoded. 
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Table 5-7 (Com.) IBox-to-EBox Interface Signals 



Name 



IBOX_RLOG_CONTEXT_H[03:00] 



Description 



Indicates the context for an RLOG request. The field 
supplies size and direction of changes. 



IBOX_RLOG_TAG_H[01:00] 



IBOX_RLOG_WRITE_H 

IBOX_RLOG_COMPLETE_H 

IBOX_OSQA_RLOG_PARITY_H 
IBOX_INSTRUCTION_DECDODED_H 

IBOX_CORRECTION_H 



ffiOX_PREDICTION_H 

IBOX_PC_H[31:00] 

ffiOX_PC_PARITY_H[03:00] 
IBOX_PC_VALID_H 



Field 


Change 


0000 


No change to GPR 


0001 


GPR incremented by one (byte) 


0010 


GPR incremented by two (word) 


0011 


GPR incremented by four (longword) 


0100 


GPR incremented by eight (quadword) 


0101 


GPR incremented by sixteen (octaword) 


0110 


Unused 


0111 


Unused 


1000 


Unused 


1001 


GPR decremented by one (byte) 


1010 


GPR decremented by two (word) 


1011 


GPR decremented by four (longword) 


1100 


GPR decremented by eight (quadword) 


1101 


GPR decremented by sixteen (octaword) 


1110 


Unused 


1111 


Unused 



No change is used when the IBox detects an IRC. 
This records the fact that the GPR was modified, but 
only in the IBox. 

Indicates for which instruction the RLOG information 
pertains. The field is matched with 3-bit counters 
that indicate which instruction is being executed in 
the EBox and IBox. 

Asserted to indicate that RLOG information is to 
be written and to validate all other RLOG-related 
signals. 

Asserted to inform the EBox that the CSU has 
stopped evaluating specifiers. 

Parity bit for the RLOG-related signals. 

Informs the EBox that the XBAR has completed 
decoding an instruction. 

Asserted by the PCU on a bad branch prediction. 
Informs the EBox that an unwind is not necessary, 
as the branch has not yet been shifted out of the 
instruction buffer. 

Asserted by the PCU to inform the EBox that the 
branch under decode is predicted taken. 

These lines deliver a copy of the decdode PC to the 
EBox. 

Byte parity for the EBox copy of the decode PC. 
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Table 5-7 (Com.) IBox-to-EBox Interface Signals 



Name Description 



IBOX_FORK_ERROR_H Asserted to inform the EBox that a fork error has 

been detected. 

D30X_POINTER_ERROR_H Asserted to inform the EBox of a detected pointer 

error. 

IBOX_IB_PAGE_FAULT_H Asserted to inform the EBox that the data being 

decoded page faulted in the MBox. The signal is not 
sent unless the data is accessed. 

IBOX_RAF_H The XBAR informs the EBox of an RAF by asserting 

this line. 



5.8 EBox-to-IBox Interface 

The EBox interface to the IBox is used to write result data to GPRs and to provide a 
variety of control functions for the IBox. The control signals can include starting or 
flushing PCs, RLOG unwinds, interrupts, and branch prediction status signals. 

Figure 5-30 lists the signals that comprise the EBox-to-IBox interface and Table 5-8 
describes them. 
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IBOX 



PCU 
OSQA 



OCTL 

STQ2 

STG2 

STG2, 3 

STG3 

OCTL 
XDTB 

OCTL 
OSOA 

OSOB 

OSOB 

STG3. 2 

PCHI, LO 
PCVC, BP 

STG3, 2 

PCHI, LO 
PCBP 

OSOB 

XDTB 
XDTA 



EBOX_BRANCH_A_H&L, B_H 



EBOX_BRANCH_VALID_A_H&L, B_H 



EBOX_FLUSH_H[02:00) 



EBOX_GPR_BYTE0_WRITE_H 



EBOX_GPR_BYTEl_WRITE_H 



EBOX_GPR_H[03:00] 



EBOX_GPR_WORD1_WRITE_H 



EBOX_KEEP_MASKS_H. L[01:00] 



EBOX_INSTRUCTION_DONE_H 



EBOX_INTERRUPT_H 



EBOX_LAST_POINTER_H[03:00] 



EBOX_OUEUE_FULL_H 



EBOX_RESULT_H[31 :16, 15:00] 
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EBOX 



EBOX_RESULT_L[31:24. 23:08. 07:05, 07:00] 



EBOX_RESULT_PARITY_H[03:02, 01:00] 



EBOX_RESULT_PARITY_L[03, 02:01, 00] 



EBOX_RLOG_FULL_H 



EBOX_RLOG_POINTER_HI01:00] 



EBOX UNSUSPEND_H 



Figure 5-30 EBox-to-IBox Interface 



UR_X0-.2<_e9 
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Table 5-8 EBox-to-IBox Interface Signals 



Name 



Description 



EBOX_BRANCH_A,B_H 
EBOX_BRANCH_VALID_H 

EBOX_FLUSH_H[02:00] 
EBOX_GPR_BYTE0_WRITE_H 

EBOX_GPR_BYTEl_WRITE_H 

EBOX_GPR_H[03:00] 

EBOX_GPR_WORDl_WRITE_H 

EBOX_KEEP_MASK_H[01:00] 

EBOX_INSTRUCTION_DONE_H 

EBOX_INTERRUPT_H 

EBOX_LAST_POINTER_H[03:00] 

EBOX_QUEUE_FULL_H 

EBOX_RESULT_H[31:00] 

EBOX_RESULT_PARITY_H[03:00] 
EBOX_RLOG_FULL_H 

EBOX_RLOG_POINTER_H[02:00] 

EBOX UNSUSPEND_H 



Asserted in the PCU and OSQA when a bad branch 
prediction is detected. 

Asserted to inform the IBox that a conditional branch 
has been retired. The CSU decrements the branch 
count if the branch was predicted correctly. 

This field passes EBox flush signals to the IBox. 

Asserted to indicate that the EBox wishes to write 
byte of the GPR pointed to by EBOX_GPR_H{03:00]. 

Asserted to indicate that the EBox wishes to write 
byte 1 of the GPR pointed to by EBOX_GPR_H[03:00]. 

These lines indicate which GPR the EBox will write 
to. 

Asserted to indicate the EBox wishes to write 
the high-order word that EBOX_GPR_H[03:00] is 
addressing. 

This field informs the IBox of how many register 
masks to keep on a bad branch prediction. The field 
reflects the number of instructions still in progress in 
the EBox. 

Asserted when an instruction is complete in the 
EBox. Directs the register mask logic to delete the 
oldest mask. 

Asserted when the EBox is taking an exception or 
an interrupt and directs the IBox to stop processing 
specifiers. 

This field provides a pointer to the last used location 
in the source list. 

Asserted when the EBox queues are full (except 
RLOG). 

These lines deliver the result data to be written to 
the IBox GPRs and is also the path used to supply a 
new PC to the IBox. 

Byte parity for the EBox result data. 

Asserted to inform the IBox that the EBox RLOG 
queue is full. 

This field is sent to XDTB and describes the current 
RLOG entry. 

This signal is sent to XSCA to negate XBAR_ 
SUSPEND_H. 
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5.9 VBox Interface 

The VBox interfaces to the MBox through the EBox and the IBox. The MBox prefetches 

the operand requested and the VBox, through the IBox STREGs and OPU port, requests 

the operands. The operand data is returned to the EBox source list. 

Figure 5-31 list the signals that comprise the VBox interface and Table 5-9 describes 

them. 



IBOX 


VBOX_ADDRESS_H[31:16, 15:00] 


VBOX 


STG3, 2 
ST63, 2 

OSQA 
OSQA 
OSQA 
OSQA 
OSQA 




VBOX_ADDRESS_PARITY_H[03:02, 01 :00] 


VBOX_ADDRESS_VALID_H 


VBOX_REFERENCE_SIZE_H 


VBOX_REFERENCE_TYPE_H 


VBOX_BLOCK_READ_H 


VBOX_READ_NOP_H 





Figure 5-31 VBox Interface 
Table 5-9 VBox Interface Signals 



Name 



VBOX_ADDRESS_H[31:02] 

VBOX_ADDRESS_PARITY_H[03 :( >0] 
VBOX_ADDRESS_VALID_H 

VBOX_BLOCK_READ_H 
VBOX_REFERENCE_SIZE_H 

VBOX_REFERENCE_TYPE_H 



MR.X0125_«9 



Description 



Address lines of the request the VBox is making to 
the MBox. Bits and 1 are always cleared. 

Byte parity for the VBox address. 

Asserted to indicate that the VBox is making an 
MBox request, across the OPU port. The signal 
validates all other VBox signals received by the IBox. 

Asserted when the VBox request is for a block of data 
(16 longwords). When asserted, it is assumed that a 
read request is being made. 

When asserted, indicates that the reference size of 
the VBox request is a quadword. When negated, the 
reference is a longword. 

When asserted, indicates that the VBox reference 
type is write. When negated, the reference type is 
read. 
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6 

IBox Error Descriptions 



This chapter describes the IBox error registers. Six registers report errors. The tables 
in this chapter contain a description of each error, including the register bit, the error 
pneumonic, the error signal that asserts the error bit, and a description of the failing 
data or control signals that cause the error. 

6.1 IBox Error Registers 

The six error registers that report IBox errors are as follows: 

IBOX_FETCH_ERROR_REGl_H[3 1:00] 
IBOX_FETCH_ERROR_REG2_H[3 1:00] 
IBOX.DECODE ERROR_REG1_H[31:00] 
IBOX_XBR_DECODE_ERROR_REGISTER_H[31:00] 
IB0X_SPECIFIER_REG1_H[3 1:00] 
IBOX_SPECIFIER_REG2_H[3 1:00] 

Each register corresponds to one of the three IBox pipeline stages: fetch, decode, or 
specifier. Data and control flow mainly from the fetch stage to the decode stage, then 
to the specifier stage in the IBox. Because of this relationship, if an error is detected in 
a previous stage, only that error is reported. Once an error is detected in the IBox, the 
state elements in the IBox are held until the console can scan in new values and restart 
the IBox. 

6.2 Fetch Error Register 1 

Two 32-bit registers report errors occurring in the fetch stage of the IBox pipeline. The 
errors reported are those detected in the following MOAs: 

PCBP 

PCVC 

PCLO 

PCHI 

IBFA 

IBFB 

Figure 6-1 shows the bit field breakdown of fetch error register 1. Table 6-1 describes 
each error that this register reports. 



DIGITAL INTERNAL USE ONLY 6-1 
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IBFA PARITY ERRORS 



VIC 
TAG 



PRE 
PC 



XDTB 
DISP 



XDTA 
DISP 



PCHI PARITY ERRORS 



PCBP 
DEC PC 



BP 
PRE- 
DICT 



BPDISP 
01 00 



PRE 
PC 



DEC 

PC 



PCLO PARITY ERRORS 



EBOX BP 

RE- I PRE 
SULT DICT 



PCBP 
DEC PC 



PCLO PARITY ERRORS 
PRE- 



DICT 
PC 



PRE 
PC 



DEC 
PC 



PCBP 
DEC PC 



PCVC PARITY ERRORS 

BP 
PRE- I BP PRE 

DICT TAG PC 



DEC 
PC 



PCBP PARITY ERRORS 



PCHI I PCLO | 
DEC PC DEC PC 



BP 
PRE- 
DICT 



I BP 
LENGTH 



BPTAQ 
01 00 



PRE 
PC 



DEC 
PC 



ufi_]c0i«_ag 



Figure 6-1 Fetch Error Register 1 
Table 6-1 Fetch Error Register 1 



Bits 



Error Name 



Description 



00 DEC PC (PCBP_DECODE_ 
PC_ERROR_H) 

01 PRE PC (PCBP_PREFETCH_ 
PC_EPvROR_H) 

03:02 BP TAG (PCBP_BP_PRED_ 

TAG_ERROR_H[03:02]) 



04 BP LENGTH (PCBP_BP_ 

INSTR_LENGTH_ERROR_H) 



05 BP PREDICT (PCBP_BP_ 
PREDICTION_ERROR_H) 

06 PCLO DEC PC (PCBP_PCLO_ 
DECODE_PC_ERROR_H) 

07 PCHI DEC PC (PCBP_PCHI_ 
DECODE_PC_ERROR_H) 

08 DEC PC (PCVC_DECODE_ 
PC_ERROR_H) 

09 PRE PC (PCVC_PREFETCH_ 
PC_ERROR_H) 

10 BP TAG (PCVC_BP_PRED_ 
TAG_ERROR_H) 



11 BP PRED (PCVCJBP. 
PREDICTION_ERROR_H) 

12 PCBP DEC PC (PCVC_PCBP_ 
DECODE PC ERROR_H) 



Occurs when a parity error is detected in data being 
latched in DECODE_PC_TA_H[07:00] in PCBP. 

Occurs when a parity error is detected in the data 
being latched in PREFETCH_PC_H[07:00] in PCBP. 

Occurs when a parity error is detected in bits 
[31:16] of the branch prediction cache tag field. 
Parity checking is performed in PCBP. 

Occurs when a parity error in the instruction length 
field is stored in the BP STRAMs on the VIC MCU. 
Parity checking is performed after the field is passed 
to PCBP. 

Occurs when the two prediction bits from the BP 
STRAMs are not the same. PCBP receives these 
bits from the VIC MCU. 

Occurs when a parity error is detected in PCLO_ 
DECODE_PC_TB_H[23:16]. PCBP receives this 
field from PCLO and performs the parity check. 

Occurs when a parity error is detected in PCHI_ 
DECODE_PC_TB_H[31:24]. PCBP receives this 
field from PCHI and performs the parity check. 

Occurs when a parity error is detected in the data 
being latched in DECODE_PC_TA_H[15:08]. 

Occurs when a parity error is detected in the data 
being latched in PREFETCH_PC_H[ 15:08] in PCVC. 

Occurs when a parity error is detected on bits 
[15:10] of the branch prediction tag. The parity 
check is performed when the STRAM data is passed 
to PCVC. 

Occurs when the BP prediction bits sent to PCBP 
are not the same. The bits are stored in the BP 
STRAMs of the VIC MCU. 

Occurs when a parity error is detected on data being 
latched in PCBP_DECODE_PC_TA_H[05:00]. 
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Table 6-1 (Cont.) Fetch Error Register 1 



Bits 



Error Name 



Description 



13 
14 
15 
16 

17 

18 

19 

20 

22:21 

23 

24 

25 

26 

29:27 
31:30 



DEC PC (PCLO_DECODE_ 
PC_ERROR_H) 

PRE PC (PCLO_PREFETCH_ 
PCERROR.H) 

PREDICT PC [15:13] (PCLO_ 
PRED_PC_15_13_ERROR_H) 

BP PREDICT (PCLO_BP_ 
PREDICTION_ERROR_H) 

EBOX RESULT (PCLO_ 

EBOX_RESULT_ERROR_ 

H[01J) 

PCBP DEC PC (PCLO_PCBP_ 
DECODE_PC_ERROR_H) 

DEC PC (PCHI_DECODE_PC. 
ERRORJH) 

PRE PC (PCHI_PREFETCH_ 
PC_ERROR_H) 

BP DISP (PCHI_BP_TAG_ 
DISP_ERROR_H[01:00]) 

BP PREDICT (PCHI_BP_ 
PREDICTION_ERROR_H) 

PCBP DEC PC (PCHI_PCBP_ 
DECODE_PC_ERROR_H) 

XDTA DISP (PCHI_XDTA_ 
DISP_ERROR_H) 

XDTB DISP (PCHI_XDTB_ 
DISP ERROR.H) 



PRE PC (IBFA_PREF_PC_ 
ERROR_H[03:01]) 



VIC TAG (IBFA_VICT_TAG_ 
ERROR_H[02]) 



Occurs when the data being latched in DECODE. 
PC_TA_H[23:16] causes a parity error in PCLO. 

Occurs when the data being latched in PREFETCH. 
PC_H[23:16] asserts a parity error in PCLO. 

Occurs when BPPC_PREDICTION_PC_B_H[15:13] 
from BPST asserts a parity error in PCLO. 

Occurs when the two BP prediction bits sent to 
PCBP are not the same. The bits are stored in the 
BP STRAMs in the VIC MCU. 

Occurs when EBOX_RESULT_H[15:08] from the 
EBox asserts a parity error in PCLO. 

Occurs when PCBP_DECODE_PC_TA_H[05:00] 
or PCBP_DECODE_PC_PARITY_TA_H asserts a 
parity error in PCBP. 

Occurs when the data being latched in DECODE. 
PC_TA_H[31:24] causes a parity error in PCHI. 

Occurs when the data being latched in PREFETCH. 
PC_H[31:24] asserts a parity error in PCHI. 

Occurs when BPTD_TAG_DISPLACEMENT_ 
H[ 15:00] asserts a parity error in PCHI. 

If this error occurs, the two prediction bits sent to 
PCBP from the BP STRAMs are not the same value. 

Occurs if a parity error is detected when checking 
PCBP_DECODE_PC_TA_H[05:00] on PCBP. 

Asserted in PCHI when a parity error is detected in 
XDTA_DISPLACEMENT_L[11:08, 03:00]. 

Asserted in PCHI when a parity error is detected 
when checking XDTB_DISPLACEMENT_L[15:12, 
07:04] and XDTB_DISPLACEMENT_PARITY_ 
H01]. 

Asserted in IBFA when a parity error :s detected 
in the following signals: PCBP_PREFETCH_PC_ 
H[04:00] from PCBP to IBFA PCHI_PREFETCH_ 
PC_H[31:24] from PCHI to IBFA and PCLO_ 
PREFETCH_PC_H[23:16] from PCLO to IBFA. 

Bit 30 of this error is asserted when VICT_TAG_ 
H[23:13] asserts a parity error. Bit 31 is set when 
VICT_TAG_H[31:24] asserts a parity error. Both 
VIC tag fields are checked in IBFA when they are 
passed from the VICT STRAMs to the XBR MCU. 
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6.3 Fetch Error Register 2 

Fetch error register 2 is 11 bits wide and reports errors occurring in IBFA and IBFB. 
Figure 6-2 shows the bit field breakdown of this register and Table 6-2 describes each 
error that this register reports. 



31 30 



27 26 



NOT 
USED 



IBFB PARITY ERRORS 



IBEX ERROR |07:00] 
05 04 03 



IBFA PARITV ERRORS 

VICB VICA . 
BLOCK BLOCK VIC 
VAL VAL OW VAL 



Figure 6-2 Fetch Error Register 2 
Table 6-2 Fetch Error Register 2 



Bits 



Error Name 



Description 



00 VIC QW VAL (IBFA_VIC_ 
QUADWORD_ERROR_H) 

01 VICA BLOCK VAL (IBFA_ 
VICA_BLOCKVALID_ 

ERRORH) 

02 VICB BLOCK VAL (IBFA_ 
VICB_BLOCK_VALID_ 
ERROR_H) 

10:03 IBEX ERROR (A_IBFB_IBEX_ 

ERROR_H[07:00]) 



Occurs when a parity check of VICQ_QUADWORD_ 
VALIDS_H[03:00] fails. 

Occurs when VICA_BLOCK_VALID_H[01:00] is 
parity checked after being passed from VICT to 
IBFAontheXBRMCU. 

Occurs when VICB_BLOCK_VALIDH[01:00] fails 
parity testing after being passed from VICT to IBFA 
on the XBR MCU. 

Occurs when IBEX_DATA_H[63:00] fails a parity 
test that is performed as the data is passed to IBUF. 
The failing data is held partly in IBFA, partly in 
IBFB. Each byte of IBEX data has one parity bit. 
Each byte of IBEX data is split into nibbles. The 
low nibbles are stored in IBFA and the high nibbles 
are stored in IBFB. 
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6.4 Decode Error Register 1 

Decode errors are reported in two registers. Decode error register 1 reports errors that 
occur on instruction buffer data that is passed to XBAR and on control signals that are 
passed from XBAR to the PCU (shift counts, shift opcode). Figure 6-3 shows the bit field 
breakdown of this register and Table 6-3 describes each error that this register reports. 



NOT USED 


15 


14 


13 


12 


11 10 09 OB 07 


06 


OS 


04 


03 


02 


01 


00 


NOT 
USED 


OS 


07 


06 


IBFB PARITY ERRORS 

IBUF ERROR [0B:00] 
05 04 03 02 01 


00 


IBFB 
SC PE 


IBFA 
DEC PE 


PCHI 
DEC PE 


PCLO 
DEC PE 


PCVC 
DEC PE 


PCBP 
DEC PE 



Figure 6-3 Decode Error Register 1 
Table 6-3 Decode Error Register 1 



Bits 



00 



01 



02 



03 



04 



05 



Error Name 



PCBP DEC PE (A_PCBP_ 
DECODE_ERROR_H) 



PCVC DEC PE (A_PCVC_ 
DECODE_ERROR_H) 



PCLO DEC PE (A_PCLO_ 
DECODE_ERROR_H) 



PCHI DEC PE (A_PCHI_ 
DECODE_ERROR_H) 



IBFA DEC PE (A_IBFA_ 
DECODE_ERROR_H) 



IBFB DEC PE (A_IBFB_SC_ 
ERROR_H) 



Description 



Occurs when an error is detected in XSCA_ 
SHIFTCOUNT_C_L[03:00], XSCA_SHIFTOPCODE_ 
C_L, or XSCA_PCBP_B_PARITY_H. Parity checking 
is performed on PCBP as these signals are passed 
from XBAR. 

Occurs when an error is detected in A_ 
XBAR_SHIFTCOUNT_H[03:00], A_XBAR_ 
SHIFTOPCODE_H, and A_XSCA_PCVC_B_ 
PARITY_H as they are passed to PCVC. 

Occurs when an error is detected in 
XSCA_SHIFTCOUNT_C_H[03:00], XSCA_ 
SHIFTOPCODE_C_H, and XSCA_PCLO_B_ 
PARITY_H as they are passed to PCLO. 

Occurs when an error is detected in 
XSCA_SHIFTCOUNT_B_H[03:00], XSCA_ 
SHIFTOPCODE_B_H, XSCA_DISPLACEMENT_ 
VALID_H, and XSCA_PCHI_B_PARITY_H when 
they are passed to PCHI. 

Occurs when an error is detected in 
XSCA_SHIFTCOUNT_A_H[03:00], XSCA_ 
SHIFTOPCODE_A_H, XSCA FD_SHIFTOPCODE_ 
H, and XSCA_IBFA_B_PARITY_H as they are 
passed to IBFA. 

Occurs when an error is detected in XSCA_ 
SHIFTCOUNT_A_L[03:00],XSCA_SHIFTOPCODE_ 

A_L, XSCA_FD_SHIFTOPCODE_L, XSCA_ 
EXTENDED.L, and XSCA_IBFB_B_PARITY_H 
as they are passed to IBFB. 
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Table 6-3 (Cont.) Decode Error Register 1 



Bits 



Error Name 



Description 



14:06 IBUF ERROR (A_ffiFB_ffiUF_ 

ERROR_H[08:00]) 



Covers the IBFA and DBFB data paths for IBUF 
data to XBAR. Detection of a parity error in any 
of the nine bytes of IBUF data as it is passed to 
XBAR asserts the respective bit in this signal. (For 
example, when a parity error is detected in byte 
of the IBUF data, byte of A_IBFB_IBUF_ERROR_ 
H[08:00] is asserted.) 



6.5 XBAR Decode Error Register 

The XBAR decode error register reports decode errors that occur, primarily, in the 
XBAR. Failing data passing between XSCA, XDTA, and XDTB asserts errors that are 
reported in this register. Control signals with sources external to XBAR are also reported 
(for example, OSQA_BRANCH_COUNT_H[01:00] and OSQA_SL_BUSY_STALL_H). 
Figure 6-4 shows the bit field breakdown of this register and Table 6-4 describes each 
error that this register reports. 



31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 



NOT USED 

mmmmi 1_ 



I fa 



15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 



NOT 
USED 



XSCA PARITY ERRORS 



OSOA XDTA I IBFB IBFA 

PE PE I PE PE 



NOT 
USED 



XDTB PARITY ERRORS 



XSCA IBFB 

PE PE 



IBFA 
PE 



NOT 
USED 



XDTA PARITY ERRORS 



XSCA IBFB IBFA 

PE PE PE 



Figure 6-4 XBAR Decode Error Register 
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Table 6-4 XBAR Decode Error Register 



Bits 



Error Name 



Description 



00 



01 



02 



IBFA PE (EBFA_XDTA_ 
PARITY_ERROR_H) 



IBPB PE (IBFB_XDTA_ 
PARITY_ERROR_H) 



XSCA PE (XSCA_XDTA_ 
PARITY_ERROR_H) 



Occurs when an error is detected in IBFA_YREG_ 
F_A_L[03:01], IBFA_VALID_A_L[03:01], or ffiFA_ 
DATA_A_L[67:64, 59:56, 51:48, 43:40, 35:32, 27:24, 
19:16, 11:08] as they are passed from IBFA to XBAR. 

Occurs when an error is detected in IBFB_DATA_A_ 
L[39:36, 31:28, 23:20, 15:12] or IBFB_REGISTER_ 
MODE_L[08:05] as they are passed from IBFB to 
XBAR. 

Occurs when an error is detected on the interconnect 
between XSCA and XDTA The signals that can 
cause this error are as follows: 

XSCA_IMPLIED_MASK_H 

XSCA_IRC_L 

XSCA_REQUEST H[03:00] 

XSCA_SPECIFIERS_DECODED_H[01 :00] 

XSCA_SP1_ACCESS_H[09:00] 

XSCA_SP1_DATATYPE_H[02:00] 

XSCA_SP2_ACCESS_H[06:00] 
XSCA_SP2_DATATYPE_H[02:00] 
XSCA_SP3_ACCESS_H[01:00] 
XSCA_SP3_DATATYPE_H[02:01] 
XSCA X8F H 



04 



05 



06 



IBFA PE (IBFA_XDTB_ 
PARITY_ERROR_H) 



IBFB PR (IBFB_XDTB 
PARITY_ERROR_H) 



XSCA PE (XSCA_XDTB_ 
PARITY_ERROR_H) 



08 



IBFA PE (IBFA_XSCA_ 
PARITY_ERROR_H) 



Occurs when an error is detected in IBFA_DATA_A_ 
H[03:00] or IBFA_YREG_F_A_H[04:01] as they are 
passed from IBFA to XDTB. 

Occurs when a parity error is detected in IBFB_ 
DATA_A_H[71:68, 63:60, 55:52, 47:44, 39:36, 31:28, 
23:20, 15:12, 07:04] after it is passed to XDTB. 

Occurs when an error is detected on the interconnect 
between XSCA and XDTB. An error detected in one 
or more of the following signals can assert this 
error: 

XSCA_REQUEST_H[03:00] 

XSCA_IRC_L 

XSCA_X8F_H 

XSCA_SP1_DATATYPE_H[02:00] 

XSCA_SP2_DATATYPE_H[02:00] 

XSCA_SP1_DATATYPE_H[02, 00] 

XSCA_SPECIFEERS_REMAINING_L[02:01] 

XSCA_SPECIFIERS_DECODED_H[01:00] 

Occurs when an error is detected in one or more of 
the following signals: 

IBFA_DATA_B_H[03:00] 
IBFA_VALID_H[08:00] 
IBFA_YREG_F_B_H[04:01] 
ffiFA_IB_PAGE_FAULTH 
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Table 6-4 (Com.) XBAR Decode Error Register 



Bits 



Error Name 



09 IBFB PE (IBFB_XSCA_ 

PARITY_ERROR_H) 



Description 



Occurs when an error is detected on the interconnect 
between IBFB and XSCA The following signals can 
assert this signal: 

ffiFB_DATA_B_H[39:36, 31:28, 23:20, 15:12, 

07:04] 
IBFB_REGISTER_MODE_H[08:05] 

ffiFB_SL_MODE_H[07:05] 
IBFB_UNCONDITIONAL_B_H 



10 XDTA PE (XDTA_XSCA_ 
PARITYJERRORJH) 

11 OSQA PE (OSQA_XSCA_ 
PARITY_ERROR_H) 



Occurs when an error is detected in XDTA_IRC_ 
MASK_H[08:00] as it is passed from XDTA to 
XSCA This error may propagate to a specifier error. 

Occurs when OSQA_DECODE_STALL_H, OSQA. 
BRANCH_COUNT_H[01:00],OSQA_SL_BUSY_ 
STALLH, or OSQA_IN_SEQUENCE_H asserts a 
parity error in XSCA This error may propagate to a 
specifier error. 



6.6 Specifier Error Register 1 

Specifier error register 1 reports errors that occur when data is being passed from XBAR 
to OSQA, OSQB, and OCTL. Errors occurring with the control signals passed to the 
OPU MCU and the data paths for the FPL and SLU are reported through this register. 
Figure 6-5 shows the bit field breakdown of this register and Table 6-5 describes the 
errors that this register reports. 



31 30 



25 28 27 26 25 24 23 22 21 2° ^ 18 1 ? ' s 



_j ^ 



16 14 13 1 2 11 10 09 08 07 06 05 04 03 02 01 00 





OCTL PARITY ERRORS 


OSOB PARITY ERRORS 




OSOA PARITY ERRORS 


NOT 
USED 

i 


REG MASK PE [05:00] 


OP- 
CODE 
PE 


XDTBPE 
01 00 


XOTA 
PE 


NOT 
USED 


MASK 
PE 


XDTB | XDTA 
PE | PE 



Figure 6-5 Specifier Error Register 1 
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Table 6-5 Specifier Error Register 1 



Bits 



Error Name 



Description 



00 XDTA PE (XDTA_OSQA_ 

PERR_TA_H) 



01 



02 



04 



XDTB PE (XDTB_OSQA_ 
PERRJAH) 



MASK PE (OSQA_MASK_ 
PERR_TA_H) 



XDTA PE (XDTA_OSQB_ 
PERR_TA_H) 



Asserted by detection of a parity error in XDTA_ 
XREG_H[03:00], XDTA_OSQA_A_PARITY_H, or 
XDTA_YREG_H[03:00] after they are passed from 
XDTA to OSQA 

Occurs when a parity error is detected in XDTB_ 
ORDERH, XDTB_SLVALID_H, XDTB_OSQAJB_ 
PARITY H, XDTB INDEXED_H, XDTB_MODE_ 
H[03:00], or XDTB_RLOG_TAG_H[02:00] in OSQA 

Occurs when a parity error is detected when 
checking OCTL_EBOX_READ_MASK_H[14:00], 
OCTL_EBOX_WRITE_MASK_H[14:00], and OCTL_ 
EBOX_MASK_PARITY_H. 

Asserted by a parity error on the interconnect 
between XDTA and OSQB. An error in one or more 
of the following signals asserts this error: 

XDTA_SL_H[03:00] 

XDTA_DESTINATION_REG_H[03:00] 

XDTA_SOURCE1_REG_H[03:00] 

XDTA_SOURCE2_REG_H[03:00] 

XDTA_DESTINATION_REG_VALID_H 

XDTA_DESTINATION_VALID_H 

XDTA_SOURCEl_REG_VALID_H 

XDTA_SOURCEl_VALID_H 

XDTA_SOURCE2_REG_VALID_H 

XDTA_SOURCE2_VALID H 

XDTA_OSQB_B_PARITY_H 



05 XDTB PE (XDTB_OSQB_A_ 

PERRJEAH) 



06 XDTB PE (XDTB_OSQB_B 

PERR_TA_H) 



07 OPCODE PE (OPCODE, 

PERR_TA_H) 



08 REG MASK PE [00] (REG 

MASK_PERR_TA_H[00]) 



09 REG MASK PE [01] (REG_ 

MASK_PERR_TA_H[01]) 



Asserted by a parity error in XDTB_OPU_ 
SPEdFIERS_COMPLETED_H[02:00],XDTB_ 
RSL_H[02:01], and XDTB_OSQB_A_PARITY_H. 
Parity checking is performed in OSQB when these 
signals are passed from XDTB. 

Asserted by a parity error in XDTB_SL_ 
SPECIFIER_NUMBER_H[02:00], XDTB_ORDER_ 
H, XDTB_SL_VALID_H, XDTB_SL_H[05:04], and 
XDTB_OSQB_B_PARITY_H. Parity checking is 
performed in OSQB as these signals are passed 
from XDTB. 

Asserted by a parity error in IBFA_OPCODE_A_ 
L[03:00], IBFB_OPCODE_L[07:04], and IBFB_ 
OPCODE JPARITY_H. Parity checking is performed 
in OSQB when these signals are received from FBFB 
and ffiFA. 

Occurs when an error is detected in register 
read/write mask in OCTL (REG_MASK0_TA_ 
H[30:00]). Each mask (mask through 5) is parity 
checked in OCTL after being passed from XDTA. 

Occurs when an error is detected in register 
read/write mask 1 in OCTL (REG_MASK1_TA_ 
H[30:00]). 
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Table 6-5 (Cont.) Specifier Error Register 1 



Bits 



Error Name 



10 REG MASK PE [02] (REG_ 
MASK_PERR_TA_H[02J) 

11 REG MASK PE [03] (REG_ 
MASK_PERR_TA_H[03]) 

12 REG MASK PE [04] (REG. 
MASK_PERR_TA_H[04]) 

13 REG MASK PE [05] (REG_ 
MASK_PERR_TA_H[05]) 



Description 



Occurs when an error is detected in register 
read/write mask 2 in OCTL (REG_MASK2_TA_ 
H[30:00]). 

Occurs when an error is detected in register 
read/write mask 3 in OCTL (REG_MASK3_TA_ 
H[30:00]). 

Occurs when an error is detected in register 
read/write mask 4 in OCTL (REG_MASK4_TA_ 
H[30:00]). 

Occurs when an error is detected in register 
read/write mask 5 in OCTL (REG_MASK5_TA_ 
H[30:00]). 



6.7 Specifier Error Register 2 

Specifier error register 2 is 26 bits wide and reports errors detected in the OPUA and 
OPUB MCAs. Figure 6-6 shows the bit field breakdown of this register and Table 6-6 
describes each error that this register reports. 



NOT 
USED 



EXPAN 
SION 



OPUB PARITY ERRORS 



OSOB 
SL OPUB AMUX , BMUX DEC PC 



CTRL : PE [03:02] 
01 , 00 



PE [03:02] 
01 



[04:03] 



SPEC 
I- STREAM 



NOT 
USED 



EXPANSION 

01 00 



OPUA 
CTRL 



AMUX 
PE [01:00] 
01 00 



OPUA PARITY ERRORS 

BMUX , 

PE [01:00] 
01 , 00 02 



•DEC PC 
[02:00] 



SPEC DEC 

l-STREAM DELTA 



Figure 6-6 Specifier Error Register 2 
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Table 6-6 Specifier Error Register 2 



Bits 



Error Name 



Description 



00 DEC DELTA (SPEC_ 
DECODE_DELTA_PERR_ 

TA_H) 

01 SPEC I-STREAM (SPEC. 
ISTREAM.LWORD.PERR. 
TA_H[00]) 

02 SPEC I-STREAM (SPEC. 
ISTREAM.LWORD.PERR. 
TA_H[01]) 

03 DEC PC [00] (SPECIFIER. 
DECODE_PC_PERR_TA_ 
H[00]) 

04 DEC PC [01] (SPECIFIER. 
DECODE.PC PERR.TA. 
H[01]) 

05 DEC PC [02] (SPECIFIER. 
DECODE.PC_PERR.TA_ 

H[02]) 

06 BMUX PE [00] (BMUX.PERR. 
TA_H[00]) 

07 BMUX PE [01] (BMUX.PERR. 
TA_H[01]) 

08 AMUX PE [00] (AMUX.PERR. 
TA_H[00]) 

09 AMUX PE [01] (AMUX.PERR. 
TA_H[01]) 

10 OPUA CTRL (OPUA. 
CONTROL_PERR_TA_H) 



11 . SL EXPANSION (SL_ 

EXPANSION.PERR.TA. 
H[00]) 

12 SL EXPANSION (SL_ 
EXPANSION_PERR_TA_ 
H[01]) 

16 SPEC I-STREAM (SPEC_ 
ISTREAM_LWORD_PERR_ 
TA_H[02]) 

17 SPEC I-STREAM (SPEC_ 
ISTREAM.LWORD.PERR. 

TA_H[03]) 



Occurs when a parity check on XDTB_DECODE_ 
DELTA_H[05:00] and XDTB_DECODE_DELTA_ 
PARITY.H fails. 

Occurs when a parity error is detected when 
testing XDTA_DISPLACEMENT_H[ 11:08], 
XDTA_DISPLACEMENT_H[03:00], and XDTA_ 
DISPLACEMENT_PARITY_H[00]. 

Occurs when a parity error is detected when 
testing XDTB_DISPLACEMENT_H[ 15:12], 
XDTB_DISPLACEMENT_H[07:04], and XDTB. 
DISPLACEMENT_PARITY_H[01]. 

Occurs when a parity error is detected when testing 
PCBP_OPU_DECODE_PC_H[07:00] and PCBP. 
OPU_DECODE_PC_PARITY_H[00]. 

Occurs when a parity error is detected while testing 
PCVC_OPU_DECODE_PC_H[12:08] and PCVC. 
OPU_DECODE_PC_PARITY_H. 

Occurs when a parity error is detected while testing 
PCLO_OPU_DECODE_PC_H[15:l3] and PCLO. 
OPU_DECODE_PC_PARITY_H[01]. 

Asserted by an error detected in the low BMUX data 
byte in OPUA (BMUX_H[07:00]). 

Asserted by an error in the high BMUX data byte in 
OPUA (BMUX_H[15:08]). 

Asserted by an error in the low AMUX data byte in 
OPUA (AMUX_H[07:00]). 

Asserted by an error in the high AMUX data byte in 
OPUA (AMUX_H[15:08]). 

Occurs when a parity error is detected in OSQA_ 
AMUX_SEL_H[02:00], OSQA_BMUX_SEL_ 
H[02:00], OSQA.CSHFT SELH[02:00], OSQA_ 
OP_GRANT_WAIT_H, OSQA_OPU_SEQ_STARTH, 
or OSQA_OPUX_CONTROL_PARITY_H. Parity 
checking is performed in OPUA 

Occurs when an error is detected while testing 
OSQB_SL_EXPANSION_H[07:00] and OSQB_SL_ 
EXPANSION_PARITY_H[00]. 

Occurs when an error is detected while 
testing OSQB_SL_EXPANSION_H[14], OSQB_ 
SL_EXPANSION_H[09:08], and OSQB.SL. 
EXPANSION_PARITY_H[01]. 

Occurs when an error is detected while testing 
XDTA_DISPLACEMENT_H[27:24], XDTA. 
DISPLACEMENT_H[19:16], and XDTA. 
DISPLACEMENT_PARITY_H[02]. 

Occurs when an error is detected while testing 
XDTB_DISPLACEMENT_H[31:28], XDTB. 
DISPLACEMENT_H[23:20], and XDTB. 
DISPLACEMENT_PARITy_H[03]. 
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Table 6-6 (Cont.) Specifier Error Register 2 



Bits 



Error Name 



Description 



18 DEC PC [03] (SPECIFIER, 
DECODE_PC_PEER_TA_ 

H[02]) 

19 DEC PC [04] (SPECIFIER. 
DECODE_PC_PERR_TA_H) 

20 BMUX PE [02] (BMUX_PERR_ 
TA_H[02]) 

21 BMUX PE [03] (BMUX_PERR_ 
TA_H[03]) 

22 AMUX PE [02] (AMUX_PERR_ 
TA_H[02]) 

23 AMUX PE [03] (AMUX_PERR_ 
TA_H[03]) 

24 OPUB CTRL (OPUB_ 
CONTROL_PERR_TA_H) 



Occurs when an error is detected while testing 
PCLO_OPU_DECODE PC_H[23:16] and PCLO_ 
OPU_DECODE_PC_PARITY_H[02]. 

Occurs when an error is detected while testing 
PCHI_OPU_DECODE_PC_H[31:24] and PCHI_ 
OPU_DECODE_PC_PARITY_H[03]. 

Asserted by an error detected in the low BMUX data 
byte in OPUB (BMUX_H[23:16]). 

Asserted by an error detected in the high BMUX 
data byte in OPUB (BMUX_H[31:24]). 

Asserted by an error detected in the low AMUX data 
byte in OPUB (AMUX_H[23:16]). 

Asserted by an error detected in the high AMUX 
data byte in OPUB (AMUX_H[31:24). 

Occurs when an error is detected in one or more of 
the following signals: 

OSQA_AMUX_SEL_L[02:00] 

OSQA_BMUX_SEL_L[02:00] 

OSQA_CSHFT_SEL_L[02:00] 

OSQA_OP_GRANT_WAIT_H 

OSQA_OPU_SEQ_START_H 

OSQA_OPUX_CONTROL_PARITY_H 



25 OSQB SL EXPANSION (SL_ 

EXPANSION_PERR_TA_ 
H[03]) 



Occurs when an error is detected while testing 
OSQB_SL_EXPANSION_H[31:29] and OSQB_SL_ 
EXPANSION_PARITY_H[03]. 



6-12 
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Tables A-l, A-2, and A-3 list the input and output signals of the IBox. The signals are 
grouped by their MCU and MCA origination or destination and are listed alphabetically. 
All signals represent communication between the IBox and the VBox, MBox, and EBox. 

The box origination of input signals is defined in the prefix of the signal (for example, 
EBOX_BRANCH_B_H[00] is an input from the EBox). 



Table A-1 IBox-VIC Signals 



Input 



Destination Origination 



EBOX_BRANCH_A_H[00] 

EBOX_BRANCH_A_H[00] 

EBOX_BRANCH_VALID_A_H[00] 

EBOX_BRANCH_VALID_A_H[OOJ 

EBOX_RESULT_L[07:00] 

EBOX_RESULT_L[07:05] 

EBOX_RESULT_PARITY_L[03:00] 

MBOX_IB_DATA_H[03:00] 

MBOX_IB_DATA_H[07:04] 

MBOX_IB_DATA_H[ 1 1:08] 

MBOX_IB_DATA_H[15:12] 

MBOX_IB_DATA_H[19:16] 

MBOX_IB_DATA_H[23:20] 
MBOX_IB_DATA_H[27:24] 
MBOX_IB_DATA_H[3 1:28] 
MBOX_IB_DATA_H[35:32] 
MBOX_IB_DATA_H[39:36] 
MBOX_IB_DATA_H[43:40] 

MBOX_IB_DATA_H[47:44] 
MBOX_IB_DATA_H[51:48] 
MBOX_IB_DATA_H[55:52] 
MBOX_IB_DATA_H[59:56] 
MBOX_IB_DATA_H[63:60] 
MBOX_IB_DATA_PARITY_H[01:00] 

MBOX_IB_DATA_PARITY_H[03:02] 



VIC-PCBP 


USQ-USQC 


VIC-PCVC 


USQ-USQC 


VIC-PCBP 


USQ-USQC 


VIC-PCVC 


USQ-USQC 


VIC-PCBP 


MUL-RETO 


VIC-PCVC 


MUL-RETO 


VIC-PCBP 


MUL-RET0+RET1 


VIC-CL#4 


DTA-DTMO 


VIC-CL#7 


DTA-DTM1 


VIC-CL#4 


DTA-DTM1 


VIC-CL#4 


DTA-DTM1 


VIC-CL#4 


DTA-DTM2 


VIC-CL#7 


DTA-DTM2 


VIC-CL#4 


DTA-DTM3 


VIC-CL#7 


DTA-DTM3 


VIC-CL#4 


DTA-DTMO 


VIC-CL#7 


DTA-DTMO 


VIC-CL#4 


DTA-DTM1 


VIC-CL#7 


DTA-DTM1 


VIC-CU4 


DTA-DTM2 


VIC-CL#7 


DTA-DTM2 


VIC-CL#4 


DTA-DTM3 


VIC-CL#7 


DTA-DTM3 


VIC-CL#7 


DTB-DTMO, DTM1 


VIC-CL#7 


DTA-DTM2, DTM3 
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Table A-1 (Cont.) IBox-VIC Signals 



Input 



MBOX_IB_DATA_PARITY_H[05:04] 

MBOX_IB_DATA_PARITY_H[07:06] 

MBOX_IB_PAGE_FAULT_L[00] 

MBOX_IB_HESPONSE_H[00] 

MBOX_IB_RESPONSE_TA_H[00] 



Output 



IBOX_ABORT_H[00] 

IBOX_ABORT_L[00] 

IBOX_IB_ABORT_H[00] 

IBOX_IB_ABORT_L[O0] 

IBOX_IB_ADDRESS_H[05:00] 

IBOX_IB_ADDRESS_H[ 12:06] 

IBOX_IB_ADDRESS_H[31:13] 

IBOX_IB_ADDRESS_PARITY_H[00] 

IBOX_IB_ADDRESS_PARITY_H[01] 

IBOX_PC_H[05:00] 

IBOX_PC_H[ 12:06] 

IBOX_PC_PARITY_H[00] 

IBOX_PC_PARITYJI[01] 
IBOX_CORRECTION_H[00] 



Destination Origination 



VIC-CL#7 

VIC-CL#7 

VIC-PCVC 

VIC-PCVC 

VIC-PCBP 



DTB-DTMO, DTM1 
DTA-DTM2, DTM3 
VAP-FALT 
CTU-CTMV 
CTU-CTMV 



Origination Destination 



VIC-PCVC 
VIC-PCVC 
VIC-PCVC 
VIC-PCVC 
VIC-PCBP 
VIC-PCVC 

VIC-PCVC 
VIC-PCBP 
VIC-PCVC 
VIC-PCBP 
VIC-PCVC 
VIC-PCBP 

VIC-PCVC 
XBR-PCHI 



MBOX-VAP-VAPO 
MBOX-VAP-CCSQ 
MBOX-VAP-VAPO 
MBOX-VAP-CCSQ 
MBOX-VAP-VAPO 
MBOX-VAP-VAPO 

MBOX-VAP-FXUP 
MBOX-VAP-VAPO 
MBOX-VAP-VAPO 
EBOX-CTL-QPCS 
EBOX-CTL-QPCS 
EBOX-CTL-QPCS 

EBOX-CTL-QPCS 
EBOX-INT-USQC 



Table A-2 IBox-XBR Signals 



Input 



EBOX_BRANCH_A_L[00] 

EBOX_BRANCH_A_L[00] 

EBOX_BRANCH_VALID_A_L[00] 

EBOX_BRANCH_VALID_A_L[00] 

EBOX_KEEP_MASKS_L[0 1:00] 

EBOX_RESULT_L[23K)8] 

EBOX_RESULT_L[31:24] 

EBOX_RESULT_PARITY_L[02:01] 

EBOX_RESULT_PARITY_L[03] 

EBOX_RLOG_POINTER_H[02:00] 

EBOX_UNSUSPEND_H[00] 

MBOX_IB_PAGE_FAULT_H[00] 

MBOX_IB_RESPONSE_TA_L[00] 
MBOX_IB_RESPONSE_TA_L[00] 
MBOX_IB_RESPONSE_TA_L[00] 



Destination Origination 



XBR-PCHI 
XBR-PCLO 
XBR-PCHI 
XBR-PCLO 
XBR-XDTB 
XBR-PCLO 

XBR-PCHI 
XBR-PCLO 
XBR-PCHI 
XBR-XDTB 
XBR-XDTA 
XBR-IBFA 

XBR-IBFA 
XBR-PCHI 
XBR-PCLO 



INT-USQC 
INT-USQC 
INT-USQC 
INT-USQC 
CTL-ISSA 
MUL-RETO 

MUL-RET1 

MUL-RETO 

MUL-RET1 

DST-SRCS 

INT-USQA 

VAP-FALT 

CTU-CTMV 
CTU-CTMV 
CTU-CTMV 
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Table A-2 (Com.) IBox-XBR Signals 



Output 



Origination Destination 



IBOX_FORK_ADDRESS_H[03:00] XBR-IBFA 

IBOX_FORK_ADDRESS_H[07:04] XBR-IBFB 

IBOX_FORK_ADDRESSJI[08] XBR-XSCA 

ffiOX_FORK_ADDRESS_PARITY_H[00] XBR-IBFB 

IBOX_FORK_ERROR_H[00] XBR-XSCA 

IBOX_FORK_VALID_H[00] XBR-XSCA 



EBOX-CTL-FRAMX 

EBOX-CTL-FRAMX 

EBOX-CTL-FRAMX 

EBOX-CTL-QPTR 

EBOX-CTL-ISSE 

EBOX-CTL-QTPR 



IBOX_IB_ADDRESS_H[21:13] XBR-PCLO 

IBOX_IB_ADDRESS_H[31:22] XBR-PCHI 

IBOX_IB_ADDRESS_PARITY_H[02] XBR-PCLO 

IBOX_IB_ADDRESS_PARrTY_H[03] XBR-PCHI 

IBOX_IB_PAGE_FAULT_H[00] XBR-XSCA 

IBOX_IB_REQUEST_H[00] XBR-IBFA 



MBOX-VAP-FXUP 

MBOX-VAP-FXUP 

MBOX-VAP-FXUP 

MBOX-VAP-FXUP 

EBOX-CTL-ISSA, ISSB, ISSC 

MBOX-VAP-VAPO 



IBOX_INSTRUCTION_DECODED_H[00] XBR-XSCA 

IBOX_PC_H[21:13] XBR-PCLO 

IBOX_PC_H[31:22] XBR-PCHI 

IBOX_PC_PARITY_H[02] XBR-PCLO 

IBOX_PC_PARITY_H[03] XBR-PCHI 

IBOX_PC_VALID_H[00] XBR-PCLO 



EBOX-CTL-ISSA, ISSB, ISSC 

EBOX-CTL-QPCS 

EBOX-CTL-QPCS 

EBOX-CTL-QPCS 

EBOX-CTL-QPCS 

EBOX-CTL-QPCS 



IBOX_PREDICTION_H[00] 

IBOX_RAF_H[00] 

IBOX_REGISTER_FORK_H[00] 



XBR-PCHI EBOX-CTL-QPTR 

XBR-XDTA EBOX-CTL-ISSA, ISSB, ISSC 

XBR-XDTA EBOX-CTL-QTPR 
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Table A-3 IBox-OPU Signals 



Input 



Destination 



Origination 



EBOX_BRANCH_B_H[00] 

EBOX_BRANCH_VALID_B_H[00] 

EBOX_FLUSH_H[02:00] 

EBOX_GPR_BYTE0_WRITE_Ht00] 

EBOX_GPR_BYTE1_WRITEJI[00] 

EBOX_GPR_H[03:00] 

EBOX_GPRJI[03:00] 

EBOX_GPR_WORD1_WRITE_H[00] 

EBOX_INSTRUCTION_DONE_H[00] 

EBOXJNTERRUPT_H[00] 

EBOX_KEEP_MASKS_H[01:00] 

EBOX_LAST_POlNTER_H[03:00] 

EBOX_QUEUE_FULL_H[00] 

EBOX_RESULT_H[15:0O] 

EBOX_RESULT_H[31:16] 

EBOX_RESULT_PARITY_H[01:00] 

EBOX_RESULT_PARITY_H[03:02] 

EBOX_RLOG_FULLJH[00J 

MBOX_OP_DATA_H[ 15:00] 

MBOX_OP_DATA_H[31:16] 

MBOX_OP_DATA_PARITY_H[01:00] 

MBOX_OP_DATA_PARITY_H[03:02] 

MBOX_OP_GRANT_H[00] 

MBOX_OP_RESPONSE_H[00] 

VBOX_ADDRESS_H[ 15:00] 

VBOX_ADDRESS_H[31:16] 

VBOX_ADDRESS_PARITY_H[01:00] 

VBOXjVDDRESS_PARITY_H[ 03:02] 

VBOX_ADDRESS_VALID_H[00] 

VBOX_BLOCK_READ_H[00] 

VBOX_READ_NOP_H[00] 
VBOX_REFERENCE_SIZE_H[00] 
VBOX REFERENCE_TYPE_H[0O] 



OPU-OSQA 


USQ-USQC 


OPU-OSQA 


USQ-USQC 


OPU-OCTL, OQSA, OSQB 


USQ-USQA 


OPU-STG2 


CTL-ISSC 


OPU-STG2 


CTL-ISSC 


OPU-STG2 


CTL-ISSC 


OPU-STG3 


CTL-ISSC 


OPU-STG3 


CTL-ISSC 


OPU-OCTL 


CTL-ISSA 


OPU-OSQA 


USQ-USQA 


OPU-OCTL 


CTL-ISSA 


OPU-OSQB 


CTL-ISSA 


OPU-OSQB 


CTL-QPTR 


OPU-STG2 


MUL-RET0+RET1 


OPU-STG3 


MUL-RET0+RET1 


OPU-STG2 


MUL-RET0+RET1 


OPU-STG3 


MUL-RET0+RET1 


OPU-OSQB 


INT-RLOG 


OPU-OPUA 


DTA-DTMO, DTM1 


OPU-OPUB 


DTA-DTM2, DTM3 


OPU-OPUA 


DTA-DTMO, DTM1 


OPU-OPUB 


DTA-DTM2, DTM3 


OPU-OSQA 


VAP-VAPO 


OPU-OSQA, OPUA, OPUB 


CTU-CTMV 


OPU-STG2 


VBOX-VAD-VMKA 


OPU-STG3 


VBOX-VAD-VMKA 


OPU-STG2 


VBOX-VAD-VMKA 


0PU-STG3 


VBOX-VAD-VMKA 


OPU-OSQA, OSQB 


VBOX-VAD-VMKB 


OPU-OSQA, OSQB 


VBOX-VAD-VMKB 


OPU-OSQA, OSQB 


VBOX-VAD-VMKB 


OPU-OSQA, OSQB 


VBOX-UCS-VCTC 


OPU-OSQA, OSQB 


VBOX-UCS-VCTA 
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Table A-3 (Com.) IBox-OPU Signals 



Output 



IBOX_DATA_H[15:00] 

IBOX_DATA_H[31:16] 

IBOX_DATAJPARITY_H[01:00] 

IBOX_DATA_PARITY_H[03:02] 

IBOX_DATA_ERROR_H{00] 

IBOX_DATA_TAG_H[03:00] 



Origination 



OPU-OPUA 
OPU-OPUB 
OPU-OPUA 
OPU-OPUB 
OPU-OSQB 
OPU-OSQA 



Destination 



EBOX-DST-STGO 

EBOX-DST-STG1 

EBOX-DST-STGO 

EBOX-DST-STG1 

EBOX-CTL-BSE 

EBOX-CTL-ISSA, ISSB, ISSC, 

DST-SRCS 



IBOX_DATA_VALID_H[00] OPU-OSQA 

IBOX_DATA_VALID_L[00] OPU-OSQA 

IBOXJDESTINATION_MEMORY_H[00] OPU-OSQB 

IBOX_DEST1NATION_POINTER_ OPU-OSQB 

H[03:00] 

IBOX_DESTINATION_VALID_H[00] OPU-OSQB 

IBOX_DST_DATA_TAG_H[03:00] OPU-OSQA 

IBOX_DST_GPRJI[03:00] OPU-OSQA 

IBOX_FLUSH_ABORT_H[00] OPU-OCTL 

IBOX_FLUSH_ABORT_L[00] OPU-OCTL 

IBOX_FREE_POINTER_H[03:00] OPU-OSQB 

IBOX_GPR_H[03:00] OPU-OSQA 

IBOX_GPR_WRITE_H[0O] OPU-OSQA 

IBOX_OP_ADDRESS_H[31:16] OPU-OPUB 

IBOX_OP_ADDRESS_H[ 15:00] OPU-OPUA 

IBOX_OP_ADDRESS_PARITY_HIO 1 :00] OPU-OPUA 

IBOX_OP_ADDRESS_PARITY_H[03:02] OPU-OPUB 

IBOX_OP_CONTEXT_H[02:00] OPU-OSQA 

IBOX_OP_CONTROL_H[02:00] OPU-OSQA 

IBOX_OP_CONTROL_PARITY_Ht00] OPU-OSQA 

IBOX_OP_INDIRECT_H[01:00] OPU-OSQA 

IBOX_OP_REQUEST_H[00] OPU-OSQA 

IBOX_OP_TAG_H[03:00] OPU-OSQA 

IBOX_OSQA_ISSA_PARITY_H[00] OPU-OSQA 

IBOX_OSQA_RLOG_PARITY_H[00] OPU-OSQA 



EBOX-CTL-ISSA, ISSB, ISSC, 
DST-SRCS 

EBOX-CTL-ISSA, ISSB, ISSC, 
DST-SRCS 

EBOX-CTL-ISSA, ISSB, ISSC, 

QPTR 

EBOX-CTL-ISSA, ISSB, ISSC, 

QPTR 

EBOX-CTL-ISSA, ISSB, ISSC, 

QPTR 

EBOX-DST-STGO, STG1 

EBOX-DST-STGO, STG1 

MBOX-VAP-VAPO 

MBOX-VAP-CCSQ 

EBOX-ISSA, ISSB, ISSC 

EBOX-DST-SRCS 

EBOX-DST-SRCS 



MBOX 
MBOX 
MBOX 
MBOX 
MBOX 
MBOX 



VAP-FXUP 
VAP-FXUP 
•VAP-VAPO 
•VAP-FXUP 
•VAP-VAPO 
•VAP-VAPO 



MBOX-VAP-VAPO 

MBOX-VAP-VAPO 

MBOX-VAP-VAPO 

MBOX-VAP-VAPO 

EBOX-CTL-ISSA 

EBOX-INT-RLOG 
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A-6 IBox Input and Output Listing 



Table A-3 (Cont.) IBox-OPU Signals 



Output 



Origination 



Destination 



IBOX_OSQB_1SSB_PARITY_H[00] 

IBOX_OSQB_QPTR_PARITYJIIOO] 

IBOX_OSQB_SRCS_PARITY_H[00] 

IBOX_POINTER_ERROR_H[00] 

IBOX_RLOG_COMPLETEJI[00] 

IBOX_RLOG_CONTEXTJI[03:00] 



OPU-OSQB 
OPU-OSQB 
OPU-OSQB 
OPU-OSQB 
OPU-OSQA 
OPU-OSQA 



EBOX 
EBOX 
EBOX 
EBOX 
EBOX 
EBOX 



CTL-ISSB 

CTL-QPTR 

DST-SRCS 

CTL-ISSE 

DST-SRCS 

DST-SRCS 



ffiOX_RLOG_TAG_H[02:00] OPU-OSQA 

IBOX_RLOG_WRITE_H[00] OPU-OSQA 

D3OX_SOURCEl_POINTER_H[04:00] OPU-OSQB 

IBOX_SOURCE1_POINTER_L[04:00] OPU-OSQB 

IBOX_SOURCE1_VALID_H[00] OPU-OSQB 

IBOX_SOURCEl_VALID_L[O0] OPU-OSQB 



EBOX-DST-SRCS 
EBOX-DST-SRCS 
EBOX-CTL-QPTR 
EBOX-CTL-SRCS 
EBOX-CTL-QPTR 
EBOX-CTL-SRCS 



IBOX_SOURCE2_POINTER_H[04:00] OPU-OSQB 

IBOX_SOURCE2_POINTER_U04:00] OPU-OSQB 

IBOX_SOURCE2_VALID_H[O0] OPU-OSQB 

IBOX_SOURCE2_VALID_L[00] OPU-OSQB 



EBOX-CTL-QPTR 
EBOX-CTL-SRCS 
EBOX-CTL-QPTR 
EBOX-CTL-SRCS 
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Index 



B 

Branch bias, 4-30 
Branch prediction, 4-27 
demote, 4-29 

primary prediction hit, 4-28 
primary predictions, 4-27 
secondary predictions, 4—30 
Branch prediction cache 
match enable, 4-29 



Complex specifier unit (CSU), 5-4 
adder, 5-8 
AMUX, 5-7 
BMUX, 5-7 

CSU microcode control, 5-12 
current PC generation, 5-10 
OPUA data path, 5-5 
OPUB data path, 5-8 



Instruction buffer, 3-9 

IBEX, 3-11 

IBEX2, 3-10 

IBUF, 3-18 

instruction buffer parity, 3-19 

merger, 3-14 

rotator, 3-11 

shifter, 3-16 
Instruction buffer interface, 3-22 

aborting requests, 3-23 

page faults, 3-23 
Instruction decode, 4-1 
Instruction fetch, 3-1 to 3-23 
Intra-instruction read conflict 

XBAR IRC logic, 4-23 
Intra-instruction read conflicts (IRC), 
4-23 



M 

MCU, 1-5 
OPU, 1-8 
VIC, 1-6 
XBR, 1-7 



EBox-to-IBox interface, 5-44 



Free pointer logic (FPL), 5-25 
destination pointer, 5-30 
free pointer, 5-29 
free pointer initialization, 5-30 
source 1, 5-25 
source 2, 5-27 



Operand control (OCTL), 5-32 

flush, 5-35 

read and write register mask parity, 
5-35 

read and write register masks, 5-32 

stalls, 5-37 
OPU port interface, 5-38 



I 

IBox error registers, 6-1 
decode error register 1, 6-5 
fetch error register 1, 6-1 
fetch error register 2, 6-4 
specifier error register 1, 6-8 
specifier error register 2, 6-10 
XBAR decode error register, 6-6 

IBox-to-EBox interface, 5-40 



PCU microcode, 4-31 

Pipeline description, 1-9 to 1-12 

Program counter, 2-1 

branch PC, 2-6 

branch PC data path, 

cache control, 2—8 

decode PC, 2-4 

decode PC data path, 

PCU errors, 2-9 

prefetch PC, 2-1 



2-6 



2-5 



Index 1 



2 Index 



Program counter (cont'd.) 
prefetch PC data path, 2-3 
target PC, 2-4 
unwind PC, 2-7 
unwind PC data path, 2-7 



Short literal specifier handler (SLU), 
5-19 

block diagram, 5-19 

floating-point expansion, 5-22 

integer expansion, 5-21 

SLU output, 5-24 

SLU parity protection, 5-24 

stalls, 5-24 
Specifier decode, 5-1 



VBox interface, 5-47 
Virtual instruction cache, 
disabling VIC hit, 3-8 



3-1 



VIC flush, 3-7 
VIC hit, 3-2 
VIC parity, 3-8 
VIC read, 3-7 
VIC write, 3-2 



X 

XBAR, 4-1 

decode tree logic, 4—7 

DRAM, 4-4 

fork logic, 4-14 

request logic, 4-9 

shift counts, 4-13 

simple decode, 4-6 

specifier counts, 4-12 

XBAR displacement data path, 4-14 

XBAR IRC logic, 4-23 

XBAR short literal data path, 4-16 

XBAR source and destination logic, 

4-18 
XRAM, 4-5 



